Support for special characters #13

Open
netzdoktor opened this issue Feb 18, 2021 · 6 comments

Comments

@netzdoktor

Thanks for this nice little crate. I'm using it indirectly, as I am creating a research website using zola.

Now I have run into an issue: I have published with co-authors whose names contain special characters. We wrote our BibTeX entries like this to make sure they work with most LaTeX systems.

  • {\'{e}} -> é
  • \"u -> ü

I don't want to maintain multiple files (one for Zola with ü, one for LaTeX with \"u), so it would be great to improve nom-bibtex.

How could we add support for this in nom-bibtex? I think I would be willing to make a PR with a bit of help.
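As a rough starting point for the two mappings above, a standard-library-only sketch might look like the following. The names `accent`, `match_accent` and `decode_accents` are hypothetical and not part of nom-bibtex, the lookup table covers only a handful of pairs, and it is an assumption that this decoding would run on field values after parsing rather than inside the parser itself.

```rust
/// Map a LaTeX accent command character and a base letter to one
/// precomposed Unicode character. Hypothetical helper: only a few
/// pairs are listed here, extend as needed.
fn accent(cmd: char, base: char) -> Option<char> {
    Some(match (cmd, base) {
        ('\'', 'e') => 'é',
        ('\'', 'a') => 'á',
        ('"', 'u') => 'ü',
        ('"', 'o') => 'ö',
        ('"', 'a') => 'ä',
        ('`', 'e') => 'è',
        ('^', 'e') => 'ê',
        ('~', 'n') => 'ñ',
        _ => return None,
    })
}

/// Try to match one accent sequence at the start of `s`.
/// Accepts `\"u`, `\"{u}`, `{\"u}` and `{\'{e}}`.
/// Returns the decoded character and the number of chars consumed.
fn match_accent(s: &[char]) -> Option<(char, usize)> {
    // Optional outer protective brace, as in `{\'{e}}`.
    let outer = s.first() == Some(&'{');
    let mut i = if outer { 1 } else { 0 };
    if s.get(i) != Some(&'\\') {
        return None;
    }
    let cmd = *s.get(i + 1)?;
    i += 2;
    // Optional inner brace around the base letter, as in `\'{e}`.
    let inner = s.get(i) == Some(&'{');
    if inner {
        i += 1;
    }
    let base = *s.get(i)?;
    i += 1;
    if inner {
        if s.get(i) != Some(&'}') {
            return None;
        }
        i += 1;
    }
    if outer {
        if s.get(i) != Some(&'}') {
            return None;
        }
        i += 1;
    }
    Some((accent(cmd, base)?, i))
}

/// Replace every recognised accent sequence in `input`.
/// Anything not recognised is copied through unchanged.
fn decode_accents(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    let mut i = 0;
    while i < chars.len() {
        if let Some((c, used)) = match_accent(&chars[i..]) {
            out.push(c);
            i += used;
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}
```

With this sketch, `decode_accents("M\\\"uller")` and `decode_accents("Andr{\\'{e}}")` turn the LaTeX-safe spellings into `Müller` and `André`, so a single .bib file could serve both LaTeX and Zola. Unknown commands are deliberately left untouched.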

@charlesvdv
Owner

Hello @Darneas!

Thanks for the kind words and for letting me know that zola is actually using my crate 👍

If you are willing to do a PR, I would be happy to review and accept it. What kind of information or help would you need to contribute?

@netzdoktor
Author

You're welcome @charlesvdv!

That's FOSS... you never know where your code is put to good use ;-)

Starting question would be where to put unit tests and implementation. parser.rs?

I guess a way forward would be for me to add the above-mentioned characters to the unit tests, and then we make progress from the place where things fail (in zola, I got a parser error in IsNot or something like that).

@charlesvdv
Owner

It's definitely in parser.rs that you will find the bug. Regarding unit tests, there are already a bunch of them in there. If you want to ease your testing, you can also add an integration test in the tests folder.

@maurofaccin

Hi,
I'm also a zola user.
Expanding on @Darneas's issue, it would be cool if all the special symbols described on the BibTeX webpage were recognized and parsed.

In particular, I'm talking about the Special Symbols page.

I'm not sure how LaTeX commands and math mode should be treated, but there is a list of special characters as well as the protected formatting rule (words and characters between curly brackets should not be reformatted; as of now, curly brackets are passed by).

TBH, I'm not sure whether this processing belongs in the parser or in the client (in this case Zola).

@najtin

najtin commented Oct 5, 2021

I use pylatexenc to parse parts of a huge BibTeX file in Python. pylatexenc is really powerful and can parse almost anything you throw at it. Unfortunately, my "homemade" BibTeX parser, which uses pylatexenc internally, is really slow. This is why I searched for a project just like nom-bibtex. Though we cannot directly port the file where the magic happens (https://github.com/phfaist/pylatexenc/blob/master/pylatexenc/latex2text/_defaultspecs.py), we may be able to use it to kickstart things. I am willing to put in the necessary effort in the next few weeks because I desperately need a much faster solution.

@charlesvdv
Owner

I think special symbols should probably be handled by the library, since they are part of the specification. Math mode is a bit more tricky. I would maybe support the common use cases that can be easily converted into Unicode code points. For the rest, I guess it would make more sense to let the client code handle it.
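The split described above (known symbols converted by the library, everything else passed to the client) could be sketched roughly as follows. This is only an illustration using the standard library: `symbol` and `decode_symbols` are hypothetical names rather than nom-bibtex functions, the table lists just a few standard LaTeX symbol commands, and the sketch does not strip braces or swallow the space that TeX ignores after a control word.

```rust
/// Look up a LaTeX symbol command by name (without the backslash).
/// Hypothetical table: a few standard commands only.
fn symbol(name: &str) -> Option<char> {
    Some(match name {
        "ss" => 'ß',
        "ae" => 'æ',
        "AE" => 'Æ',
        "oe" => 'œ',
        "OE" => 'Œ',
        "o" => 'ø',
        "O" => 'Ø',
        "aa" => 'å',
        "AA" => 'Å',
        "l" => 'ł',
        "L" => 'Ł',
        _ => return None,
    })
}

/// Replace known symbol commands with their Unicode character.
/// Unknown commands (for example math-mode macros) are left exactly
/// as written so that client code can decide what to do with them.
fn decode_symbols(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find('\\') {
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        // A command name is the run of ASCII letters after the backslash,
        // so its length in chars equals its length in bytes.
        let len = after
            .chars()
            .take_while(|c| c.is_ascii_alphabetic())
            .count();
        match symbol(&after[..len]) {
            Some(c) => {
                out.push(c);
                rest = &after[len..];
            }
            None => {
                // Not in the table: keep the backslash and move on.
                out.push('\\');
                rest = after;
            }
        }
    }
    out.push_str(rest);
    out
}
```

For example, `decode_symbols("Stra{\\ss}e")` yields `Stra{ß}e`, while `\alpha` in `decode_symbols("\\alpha + \\o")` survives unchanged and only `\o` becomes `ø`.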

@najtin if you have some time to contribute, don't hesitate ;)


4 participants