Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v3 API Issues #13819

Open
CuriousMagpie opened this issue Feb 4, 2022 · 13 comments
Open

v3 API Issues #13819

CuriousMagpie opened this issue Feb 4, 2022 · 13 comments

Comments

@CuriousMagpie
Copy link
Member

Description

From SunSparc in Aspiring Blacksmiths Guild:

The API returns massive amounts of data with each request. I am talking about hundreds of kilobytes of data returned on every request. Most of that data has nothing do do with the requests that are made. This is extremely inefficient and causes higher latency for each request. Even if the clients require updates with each query, surely they do not need the entirety of every defined data structure in order to function properly.

If I could make a suggestion to help improve system responsiveness , it would be to rethink the way API calls are handled. Ideally, each request should have one specific purpose and the response would only contain the data requested. This would improve transmission latency as well as processing time needed to parse responses.

I am unaware if this has been discussed before, and perhaps the architects are already aware and are taking this into consideration for future iterations of the backend code.

I've asked SunSparc for an example of an API call where this happens, so we have a place to start investigating this.

@SunSparc
Copy link
Contributor

SunSparc commented Feb 5, 2022

For a quick example of this, we can call the "Get challenges for a user" action:

curl 'https://habitica.com/api/v3/challenges/user?page=0' -v \
 -H "Content-Type:application/json" \
 -H "x-api-user: $HABITICA_API_USER" \
 -H "x-api-key: $HABITICA_API_KEY" \
 -H "x-client: $HABITICA_API_CLIENT"

I made this request just now and it returned 40,759 bytes of data.

This data includes 4,781 bytes of notifications. All of which have been received and read months ago. The choice to include these notifications in every request could be a different topic. But my recommendation would be, make notifications a separate request. Further, I would recommend multiple notification actions. For example:

/api/v3/notifications/new
/api/v3/notifications/[all]
/api/v3/notifications/:id:

Current I do not have any challenges. That means that all of the challenge data in the response was for "Discover Challenges". I would recommend that "Discover Challenges" should be a separate request. And that separate request should only return what needs to be displayed on the "Discover Challenges" page. Currently, the data returned contains ALL of the challenge data of each challenge. Most of that data should come from a separate request still, for example, the https://habitica.com/api/v3/challenges/:challengeId. That "Get a challenge" request could also be broken into other requests.

The point that I am trying to make is this, my request to /challenges/user did not return to me anything that I requested. I would expect to get an empty list [].

My suggestion, rethink all the endpoints and their actions from the perspective of having them each "Do One Thing And Do It Well".

It is far more efficient to have many endpoints with many actions and allow clients to do many small requests, rather than single requests that return massive amounts of data. A single network connection can be reused for multiple requests.

I hope that my feedback is helpful. I really love Habitica and have been using it for years. I hope that this feedback can start a deeper conversation about API design.

@citrusella
Copy link
Contributor

This isn't some sort of criticism of the merits of what you've said or anything (particularly not the notification thing, since I can't even think of how that data could in theory be useful for an unrelated request, contrast other parts of the user information like stats which could be useful depending on the action being called), but technically "get all challenges for user" is described in the documentation (here) as doing this:

Get challenges the user has access to. Includes public challenges, challenges belonging to the user's group, and challenges the user has already joined. Returns 10 results per page.

It also allows you to specify parameters in the query to only return ones where you're a member (member=true), or ones you own (owned=owned) or don't own (owned=not_owned).

...Or does it still return all of Discover if you use the query parameters?

@Alys
Copy link
Contributor

Alys commented Feb 5, 2022

This data includes 4,781 bytes of notifications.

At least some of those notifications will be from this issue: #10752 "notifications for 'Your Invitation has been Accepted' are not shown"

For example, SunSparc currently has has 16 notifications in total, of which 9 are GROUP_INVITE_ACCEPTED; I have 19 total, 8 GROUP_INVITE_ACCEPTED.
To be clear, these are only the type of notifications that are stored in the user's top level notifications object.
(SunSparc, I retrieved your notification data and looked at the type field, not any fields about the notification content.)

...bytes of notifications. All of which have been received and read months ago.

Apart from that issue about the GROUP_INVITE_ACCEPTED notifications, I'm not aware of any problems with old, read notifications being kept (unless of course the user has chosen to not dismiss them).
@SunSparc If you've noticed anything about notifications being kept that isn't covered in #10752, it may be worth reporting, and it's probably best to send a new report through the bug report form since it won't be directly related to this API excessive data issue (it will merely be one of the factors, and will likely need its own fix).

@SunSparc
Copy link
Contributor

SunSparc commented Feb 5, 2022

@citrusella, you are correct, the member=true does give me exactly what I would expect from a request to the https://habitica.com/api/v3/challenges/user action. I am glad that you mentioned the use of query parameters. Query parameters are important and useful tools. They can easily be misused and overused as quick fixes that eventually become permanent solutions. I think that some of our query parameters should be reimplemented using restful API design best practices.

@Alys, I am glad to see that one of the problems with the notifications response is being addressed.

Keep in mind that my reference to the /challenges endpoint, and the notifications, was a quick example to start a general conversation. I did not mean to start drilling down into specifics in order to elicit an individual solution for one of the endpoints. I guess what I am calling for is an audit. It could be valuable to audit all of our endpoints, actions, and query parameters for the purpose of making sure that they all conform with our API design master plan (do we have one of those?). As part of such an audit, an evaluation of the amounts of data returned from each request, as well as the relevance of the returned data to the intent of the request, would be illuminating.

@SunSparc
Copy link
Contributor

SunSparc commented Feb 5, 2022

If we want to get specific, 😆 we can look at the "Buy an Enchanted Armoire item" endpoint: https://habitica.com/api/v3/user/buy-armoire. A response on my account from today returns 68,847 bytes of data. This includes:

  • notifications (of course 😉 )
  • ALL of the gear that I own
  • the gear I have equipped
  • the costume I am wearing
  • ALL of my pets
  • eggs
  • mounts
  • quests
  • a slew of other things

In my opinion the Armoire should have a separate endpoint and not be verb'ed as an action under the /user endpoint. I would say that the only thing which should be returned from the /buy-armoire request would be similar to this:

{
  "type": "food",
  "dropKey": "Milk",
  "dropArticle": "",
  "dropText": "Milk",
  "value": 0
}

We could debate about whether to put the message field in there as well. As a side note, the message field should not contain any HTML. That is just fraught with all manner of peril. 😱

@probaton
Copy link
Contributor

probaton commented Feb 9, 2022

I cannot begin to express how much I agree with this ticket, but the pedant in me feels obligated to point out that many of the potential changes @SunSparc is recommending should probably come with an API version bump. Experience has taught me that, however weird and arcane it may be, there will always be someone relying on the extraneous data in an API response.

@citrusella
Copy link
Contributor

citrusella commented Feb 9, 2022

I think they're working on version 4 of the API right now? Maybe? (So maybe this is the perfect time to bring it up? 🤷‍♀️ )

@SunSparc
Copy link
Contributor

SunSparc commented Feb 10, 2022

@probaton, I would honestly like to hear all of your reasons for disagreeing with this ticket.

@citrusella, from what I can tell, the version 4 of the API has been in the works for a number of years now.

It is my expectation that any changes that are made as a result of this ticket would not be applied to the current API version, but to a future version. I agree that right now is a perfect time to make a full audit of the existing API and applying everything learned from that audit into making the next version a significant and welcome upgrade.

I also think that future versions of the API should conform to OpenAPI Specifications.

@probaton
Copy link
Contributor

probaton commented Feb 10, 2022

@SunSparc Not 100% sure, but I'm guessing my perspective was lost in translation when I used negative in my opening statement. The opinion I was going for was "this is an awesome idea, we need this now, it is awesome".

The only reason I posted at all (other than to express my support) is that removing response data represents a breaking change, which should come with a version bump. However, it seems that it was either implied or I missed the part where this proposal was intended for v4 of API.

As we were apparently already talking about releasing this in a new version I have exactly zero issues with your suggestion and look forward to enjoying a cleaner, faster API.

(Also +1 for conforming to OpenAPI.)

@Alys
Copy link
Contributor

Alys commented Feb 11, 2022

There's information about the API versions and API changes on the wiki at Application Programming Interface but in brief:

  • version 4 is the version that Habitica's website and apps use and it shouldn't be used for any third-party tools until it's released for public use (until that time, it may change without notice, causing any tools that use it to break)
  • version 3 is suitable for public use because it's stable - Habitica's staff don't allow any breaking changes to it, except one or twice in the past when it's been essential
  • any changes to the API that would be breaking changes are likely to be made to version 4, not version 3, so that all the existing third-party tools don't break

@CuriousMagpie
Copy link
Member Author

From another user:

I just wanted to report that I was facing an issue with the create task API where it was telling me that my task type is wrong when it seems like it should have told me that it couldn't determine the content type.
The current error message is misleading and leads me to assume that it is my data that is wrong.

The following request would fail with an error saying that "type" is wrong when that is not true.

$ curl -X POST \
                              -H 'x-client: testing' \
                              -H 'x-api-key: REDACT' \
                              -H 'x-api-user: REDACT' \
                              -d '{"text":"test","type":"todo"}' \
                              https://habitica.com/api/v3/tasks/user
{"success":false,"error":"BadRequest","message":"Task type must be one of \"habit\", \"daily\", \"todo\", \"reward\"."}

If I add the "Content-Type: application/json" header, it works fine.

@CuriousMagpie CuriousMagpie changed the title API returning excessive data v3 API Issues Feb 14, 2022
@munch-lax
Copy link

@CuriousMagpie what have you finalised for the above issues , do we need to split the apis according to the context ? and integrate it with the frontend

@CuriousMagpie
Copy link
Member Author

We're still determining what's the best approach to take here. We're a small team and these things take time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants