Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Any example on how to write numeric type into db? #273

Open
tangyan opened this issue Dec 11, 2020 · 10 comments
Open

Any example on how to write numeric type into db? #273

tangyan opened this issue Dec 11, 2020 · 10 comments
Labels
feature feature request

Comments

@tangyan
Copy link

tangyan commented Dec 11, 2020

In pg/types it seems there is no binding for numeric type. It would be nice if there is an example of how to do it. Or have something in unit tests as an example.

@thed636
Copy link
Collaborator

thed636 commented Dec 11, 2020

Hi!
Nemeric type, unfortunately, is not adapted in the library yet. But it may be adapted externally via specializaition of ozo::send_impl, ozo::recv_impl and ozo::size_of_impl. In this case it is necessary to know the binary representation of the type.

@thed636 thed636 added the feature feature request label Dec 17, 2020
@neogenie
Copy link
Contributor

neogenie commented Apr 8, 2021

It's not hard to add decimal/numeric support to the library. The main question is: will we serialize the numeric into a string, which in my opinion is more universal, but less performance, or will we use some decimal library from boost or intel? @thed636

@thed636
Copy link
Collaborator

thed636 commented Apr 9, 2021

Hi, Neo!
Well, IMO it is better to use something like boost::multiprecision::cpp_dec_float. Just to have all the math out of the box without additional extra dependencies.

@neogenie
Copy link
Contributor

neogenie commented Apr 9, 2021

@thed636
boost::multiprecision implements floating point arithmetic, which may be unacceptable for financial and accounting systems. The internal database implements true decimal, thus avoiding loss of precision. I suggest eliminating any dependencies at all and letting the library user decide for himself how to use the results. As a basic interface, we can offer conversion from / to a string and access to the internal representation, if you need to convert this type to the format of the required library without serialize/deserialize. It looks like it would be more correct in terms of division of responsibility between the library and the calling code.

I'll try to sketch out a PR with implementation.

@thed636
Copy link
Collaborator

thed636 commented Apr 11, 2021

Well, we already provide a customization point to allow to adapt any type via send_impl/recv_impl/size_of_impl customization. So users may adapt any type they want to the library. I suggest the Boost.Multiprecision type because of the one that may be used as a number with all the math out of the box. Of course, if the type is not good for a user there is an ability to use any other type, and the boost::multiprecision::cpp_dec_float adaptation will be an example of how to do it.

As a basic interface, we can offer conversion from / to a string and access to the internal representation

This is not the best idea, because a user always may cast a numeric type to a text in a query to get a text representation of numeric. So at this point, such a solution is a little bit pointless.

I just want to say that boost::multiprecision::cpp_dec_float adaptation does not deny any other types' adaptations and usage. But it is not a good solution to provide a text-like representation of numbers without any math support, because it is useless in most cases.

@neogenie
Copy link
Contributor

I agree that out-of-the-box math in boost::multiprecision::cpp_dec_float looks tempting.

But. The problem with using this library is that it is float with the specified accuracy in the template parameter. PostgreSQL implements the true decimal. The internal format of database and libpq determines the precision and number of characters on the fly in runtime, so it is not clear what to use as the internal type. Using the number of characters with a margin - can lead to an unwanted overuse of memory (albeit on the stack) and still does not guarantee that we will be able to save the corresponding number, and using a small number of characters will lead to loss of accuracy and unusability of the library.

My point would be that you can provide a basic decode / encode from a binary protocol in the lib and give the ability to serialize / deserialize this type into a string for tests, or, if we do not need to perform any calculations and give the user access to the internal representation for converting to a required type, be it a boost::multiprecision or some other library for working with decimal.

It looks like it makes no sense to add some implementation to the library and leave it to the client's choice.

@systocrat
Copy link

Hey, adding onto this issue to see if there is a workaround.

I'd like to serialize numeric types using the string format on the wire so I don't have to convert between the mpdecimal binary format and the Postgres binary format. As far as I can tell, many other popular libraries use the text representation, but it doesn't seem like ozo has a customization point which would allow me to explicitly tell libpq that a particular type is meant to be serialized/deserialized in the text rather than binary format.

Should I just dive in and start writing a proper conversion function or is there an easier way?

@tangyan
Copy link
Author

tangyan commented Jun 28, 2023

What I did to solve this problem at that moment was to copy Postgres internal binary format codes and use that as the serialization. I vaguely remembered it's about 2 - 3 files and modified some of the logic to strip out the codes.

@systocrat
Copy link

@thed636 Hello! I have a nearly working example implementation here that I'd like to post, but I'm struggling with size_of_impl for writing decimal values to the database.

The binary wire representation of a numeric looks like this:

    struct pg_numeric {
        BOOST_HANA_DEFINE_STRUCT(pg_numeric,
                                 (std::int16_t, ndigits),
                                 (std::int16_t, weight),
                                 (std::uint16_t, sign),
                                 (std::uint16_t, dscale));
    };

Followed by an array of signed shorts with length ndigits

Is there a clean way to do this?

@systocrat
Copy link

systocrat commented Jun 30, 2023

Here's a gist with my WIP implementation- It's not pretty but deserialization works with many cases I tested querying from Postgres, and serialization works with a couple of cases that I tested manually. This uses libmpdec++ rather than boost::multiprecision::cpp_dec_float as its numeric class of choice.

https://gist.github.com/systocrat/036f3fe7e91d14530247dee666d456a6

All that really needs to be implemented is the size_of_impl which I'm still unclear on how to do for types like this where I'm converting between two highly incompatible representations of numeric data.

The table used in the example is defined as follows:

CREATE TABLE public.ntable (
    id SERIAL PRIMARY KEY NOT NULL,
    num numeric
);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature feature request
Projects
None yet
Development

No branches or pull requests

4 participants