How to get actual message xml packet with external code or by direct database query in MAM #4124

Open
sandeepjangir opened this issue Sep 12, 2023 · 9 comments

Comments

@sandeepjangir

MongooseIM version: 4.0.0
Installed from: source
Erlang/OTP version: build with source code

We are using MAM with a Cassandra database and want to extract the message XML packet from Node.js code. The message column in the Cassandra table contains binary data.

How can we get the actual message packet with external code or by a direct database query?

@arcusfelis
Contributor

arcusfelis commented Sep 12, 2023

Hi,

Ideally you want to use

modules.mod_mam.db_message_format = "mam_message_xml"

It stores messages in the DB as plain XML.

You may also want to tweak the db_jid_format option.

https://esl.github.io/MongooseDocs/6.0.0/modules/mod_mam/#modulesmod_mamdb_message_format

But changing it on the fly will not work: you need to start with an empty archive, because all messages must be in one format and mixing two formats causes errors.

If you have a binary format in the DB, it is probably

modules.mod_mam.db_message_format = "mam_message_compressed_eterm"

which is the Erlang External Term Format: https://www.erlang.org/doc/apps/erts/erl_ext_dist.html
So, there are two easy steps you need to do:

You can use a library to decode the External Term Format:

https://www.npmjs.com/package/erlang_js
or
https://github.com/mweibel/node-etf/blob/master/README.md (there could be more libraries).
(I don't know whether these libraries can read the compressed format; if they cannot, you would probably need to patch them. Erlang uses zlib to compress terms.)
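
If a decoder library turns out not to handle compressed terms, the blob can be inflated first instead of patching the library. The sketch below assumes the column holds a term in the layout described in the External Term Format document: version byte 131, tag 80, a 4-byte big-endian uncompressed size, then zlib data. The function name inflateEterm is hypothetical, and whether MongooseIM writes exactly this layout should be checked against a real row.

```javascript
const zlib = require("zlib");

// Constants from the Erlang External Term Format document.
const VERSION_TAG = 131;   // first byte of every encoded term
const COMPRESSED_TAG = 80; // marks a zlib-compressed payload

// Hypothetical helper: takes the raw blob from the `message` column and
// returns an uncompressed external-term buffer for a decoder library.
function inflateEterm(blob) {
  if (blob.length < 2 || blob[0] !== VERSION_TAG) {
    throw new Error("not an Erlang external term (version byte 131 missing)");
  }
  if (blob[1] !== COMPRESSED_TAG) {
    return blob; // already uncompressed, pass through unchanged
  }
  if (blob.length < 6) {
    throw new Error("compressed term is truncated");
  }
  const expectedSize = blob.readUInt32BE(2);         // size after inflation
  const inflated = zlib.inflateSync(blob.subarray(6)); // zlib data follows the size
  if (inflated.length !== expectedSize) {
    throw new Error("uncompressed size does not match the header");
  }
  // The version byte is not part of the compressed data, so put it back.
  return Buffer.concat([Buffer.from([VERSION_TAG]), inflated]);
}
```

The returned buffer can then be handed to whichever External Term Format decoder you choose; the decoded value is an Erlang term describing the stanza, not an XML string.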

@arcusfelis
Contributor

Or you can use the GraphQL API to ask MongooseIM to return the messages in a readable format.

@sandeepjangir
Author

modules.mod_mam.db_message_format = "mam_message_xml"

This option is not available in MIM 4. Is there a solution specific to MIM 4?

@arcusfelis
Contributor

It is available in MIM 4.0.0, in the module mod_mam_cassandra_arch:

expand_simple_param(Params) ->
    lists:flatmap(fun(simple) -> simple_params();
                     ({simple, true}) -> simple_params();
                     (Param) -> [Param]
                  end, Params).

simple_params() ->
    [{db_message_format, mam_message_xml}].

So, pass {simple, true} to mod_mam_cassandra_arch.

How do you configure MAM?
Do you have any messages in Cassandra already?

@sandeepjangir
Author

We are already using MAM with Cassandra. Here is the config:

[modules.mod_mam_meta]
  backend = "cassandra"
  archive_chat_markers = true
  pm.user_prefs_store = "mnesia"

@sandeepjangir
Author

  1. Emptied the mam_message Cassandra table
  2. Updated the MAM config in mongooseim.toml as shown below
  3. Restarted the MIM server

[modules.mod_mam_meta]
  backend = "cassandra"
  archive_chat_markers = true
  db_message_format = "mam_message_xml"
  pm.user_prefs_store = "rdbms"

After this setup, I am still getting binary data in the table. See the attached screenshot.

Screenshot 2023-09-13 at 2 38 19 PM

@arcusfelis
Contributor

@sandeepjangir It is already XML. The leading bytes of the blob are plain text, as the Erlang shell shows:

[16#3c, 16#6d, 16#65, 16#73].
"<mes"

To read the blob as text, use

select blobastext(message) from mam_message limit 1;
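
From Node.js the same column can be read without blobastext and converted on the client. The sketch below assumes the driver delivers the blob as a Buffer (the DataStax cassandra-driver package is assumed to behave this way) and that the archive uses the XML format, so the data starts with "<". The function name messageBlobToXml is hypothetical.

```javascript
// Hypothetical helper: converts the `message` blob of one row into an XML string.
// `blob` is expected to be a Node.js Buffer holding UTF-8 encoded XML.
function messageBlobToXml(blob) {
  const LESS_THAN = 0x3c; // "<", the first byte of any XML stanza
  if (blob.length === 0 || blob[0] !== LESS_THAN) {
    // Byte 131 here would instead indicate an Erlang external term.
    throw new Error("blob does not look like XML; check db_message_format");
  }
  return blob.toString("utf8");
}

// The four bytes quoted above decode to the start of a message element.
console.log(messageBlobToXml(Buffer.from([0x3c, 0x6d, 0x65, 0x73]))); // prints "<mes"
```

Checking the first byte before decoding makes a leftover row in the older binary format fail loudly instead of producing garbage text.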

@sandeepjangir
Author

Thanks for the details, @arcusfelis. I can see the raw XML now.

Can you also guide me on implementing a feature to fetch a message by its message id (the id of the message XML packet)?

In the MAM message table, the id is stored as a unique integer value that does not appear in the message packet.

In short, we need to modify a message packet based on its message id.

Thanks in advance.

@arcusfelis
Contributor

@sandeepjangir that is not possible.
Generally, user-generated ids are treated with a grain of salt, because they are too easy to spoof.
You could add a message_id column to the schema and a DB index on it, but that would require patching the MAM Cassandra module.

The id in the schema is a MAM id, encoded as an integer.
If you have the MAM id, you can find the message in the DB, though.
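
The MAM id is an integer in the table, while clients usually see it as text in archive results. Below is a minimal sketch of converting between the two. It rests on assumptions not confirmed in this thread: that MongooseIM renders the id as base-32 text (digits 0-9, then A-V), and that the integer packs a microsecond timestamp shifted left by 8 bits together with a one-byte node id. Check mod_mam_utils in your MongooseIM version before relying on it. All function names are hypothetical.

```javascript
const BASE32_DIGITS = "0123456789abcdefghijklmnopqrstuv";

// Text form of the id (assumed base 32) -> integer as stored in the `id` column.
// BigInt is used because the value does not fit a double precisely.
function externalToMamId(external) {
  if (external.length === 0) {
    throw new Error("empty MAM id");
  }
  let id = 0n;
  for (const ch of external.toLowerCase()) {
    const value = BASE32_DIGITS.indexOf(ch);
    if (value < 0) {
      throw new Error("invalid base-32 digit: " + ch);
    }
    id = id * 32n + BigInt(value);
  }
  return id;
}

// Integer id -> text form (assumed to be upper-case base 32).
function mamIdToExternal(id) {
  return id.toString(32).toUpperCase();
}

// Assumed layout: the low 8 bits hold the node id, the rest is microseconds.
function splitMamId(id) {
  return { microseconds: id >> 8n, nodeId: id & 255n };
}
```

With the integer in hand, the row can be looked up by id, subject to the key layout of the mam_message table in your schema.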
