Import historic actions, and add to database. #38

Open
thesuperzapper opened this issue Jan 23, 2017 · 4 comments

Comments

@thesuperzapper

Is there a sensible way to import large amounts of historic actions?

Using seldon-cli import --action actions --client-name CLIENT_NAME --file-path PATH_TO_FILE imports them in some strange way that only Spark jobs can see.

@ukclivecox
Contributor

You are free to place existing data anywhere the Kubernetes clusters can access.
If you want this data to be usable by the existing Spark jobs, it should follow the JSON format for actions or events. You should also place the data in folders that mimic the layout the Spark jobs require: proj/year/month/day/data.
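To make the folder layout concrete, here is a minimal sketch that groups action records by day and writes them as JSON lines under a proj/year/month/day directory. Everything beyond that directory pattern is an assumption for illustration: the record fields (`user`, `item`, `type`, `timestamp`), the zero-padded month and day, the use of UTC, and the `actions.json` file name are hypothetical and should be checked against what the Spark jobs actually read.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone
from pathlib import Path


def partition_dir(root, client, ts):
    """Return root/client/year/month/day for a Unix timestamp.

    Assumption: UTC dates with zero-padded month and day.
    """
    day = datetime.fromtimestamp(ts, tz=timezone.utc)
    return Path(root) / client / f"{day.year:04d}" / f"{day.month:02d}" / f"{day.day:02d}"


def write_partitioned(actions, root, client, filename="actions.json"):
    """Group actions by day and append each group as JSON lines.

    `filename` is a hypothetical name, not one confirmed by the thread.
    """
    grouped = defaultdict(list)
    for action in actions:
        grouped[partition_dir(root, client, action["timestamp"])].append(action)

    written = []
    for directory, rows in sorted(grouped.items()):
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / filename
        with target.open("a", encoding="utf-8") as handle:
            for row in rows:
                # One JSON object per line.
                handle.write(json.dumps(row, sort_keys=True) + "\n")
        written.append(target)
    return written
```

Files written this way would only be visible to batch jobs that read from the same storage; nothing here touches the API server's cache.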

@thesuperzapper
Author

Yeah, but this does not import them into the database. That is, you can't request a specific user's actions from the /users/{userId}/actions API endpoint.

@ukclivecox
Contributor

The server does not store raw actions in the relational database (MySQL), for scalability reasons. By default, actions are stored in Memcached, so only recent activity is available. As an alternative, you can use Redis (http://docs.seldon.io/configuration.html#redis) to get permanent access to user actions.

At the same time, actions are sent via FluentD to permanent storage for use in model building. So it depends on which use case you have for the actions: model building, real-time access via the API, or runtime scoring.
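The split between the two paths can be pictured with a small, purely illustrative sketch. It is not Seldon's implementation: `RecentActions` is a hypothetical in-memory stand-in for the cache, the per-user cap is an invented way to model "only recent activity", and a plain list stands in for the permanent log used for model building.

```python
from collections import defaultdict, deque


class RecentActions:
    """Hypothetical stand-in for a cache that keeps only recent actions per user."""

    def __init__(self, max_per_user=3):
        # A bounded deque silently drops the oldest entry once it is full.
        self._by_user = defaultdict(lambda: deque(maxlen=max_per_user))

    def add(self, user_id, action):
        self._by_user[user_id].append(action)

    def get(self, user_id):
        # Use .get so that lookups for unknown users do not create entries.
        return list(self._by_user.get(user_id, ()))


def record(action, recent, archive):
    """Send one action down both paths: real-time cache and permanent log."""
    recent.add(action["user"], action)
    archive.append(action)


recent = RecentActions(max_per_user=3)
archive = []
for n in range(5):
    record({"user": "u1", "item": f"i{n}"}, recent, archive)

# The cache holds only the last three actions; the archive holds all five.
print([a["item"] for a in recent.get("u1")])
print(len(archive))
```

In this picture, data written straight to the archive never reaches the cache, which matches the behaviour reported earlier in the thread: imported files are visible to batch jobs but not to the /users/{userId}/actions endpoint.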

@thesuperzapper
Author

thesuperzapper commented Jan 25, 2017

I have 2 follow-up questions:

  1. As far as I can tell, seldon-cli import --action actions ... does not import actions into Redis/Memcached, just into static JSON files. If this is correct, is there a way to bulk import actions so that they can be returned by the REST API endpoint, /users/{userId}/actions, implemented here?
  2. Are you saying that after enabling Redis, as described here, you must follow the steps described here for it to work?
