Enhancement: repairing latin binomials #2

jrjhealey · 2018-08-09T11:34:58Z

Hi Jaime,

Possible enhancement for you!

If pybtex doesn't already correct this, it would be good if this could also incorporate a fix for correctly italicising Latin binomials (a fairly simple search-and-replace to switch HTML italics tags to TeX formatting commands). There's an old script online (below) which does essentially this, but it isn't the best Python in the world...
Inspired by:

https://twitter.com/MendeleySupport/status/776001527664156672

and

https://itskathylam.wordpress.com/2016/01/12/dealing-with-italics-in-bibtex-files-exported-from-mendeley/

#!/usr/bin/python
 
# By: Kathy Lam
# Date: January 11, 2016
# Purpose: Replace all instances of "<i>" with "\textit{"
#          and "</i>" with "}" in bibtex file generated by Mendeley
 
oldbib = open("bibliography.bib", "r")
newbib = open("new_bibliography.bib", "w")
 
for line in oldbib:
    if line.startswith("title"):
        if "<i>" in line:
            fixed_open_tags = line.replace("<i>", "\\textit{")
            fixed_both = fixed_open_tags.replace("</i>", "}")
            newbib.write(fixed_both)
        else:
            newbib.write(line)
    else:
        newbib.write(line)

If there was some logic to catch and handle duplicate entries that would be really useful too (a problem I end up with quite often).

Cheers!

Joe


jaimergp · 2018-08-11T14:34:49Z

Hi Joe! Thanks for the feedback.

I'd say we should regex against some common HTML code in titles (italics, subscript, and superscript, mainly). Do you have any examples at hand?

For the duplicate entries, let's create a separate issue.
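A minimal sketch of the regex idea, not the tool's actual implementation: it assumes the only markup to handle is simple `<i>`, `<em>`, `<sub>` and `<sup>` pairs without attributes, and that `\textsubscript`/`\textsuperscript` are available in the target LaTeX setup. The names `TAG_TO_LATEX`, `TAG_PATTERN` and `html_to_latex` are made up for illustration.

```python
import re

# Hypothetical mapping from HTML tag names to LaTeX commands; extend as needed.
TAG_TO_LATEX = {
    "i": r"\textit",
    "em": r"\emph",
    "sub": r"\textsubscript",
    "sup": r"\textsuperscript",
}

# Matches one innermost pair such as <i>...</i> (the body contains no other tag).
TAG_PATTERN = re.compile(
    r"<(?P<tag>i|em|sub|sup)>(?P<body>[^<]*)</(?P=tag)>", re.IGNORECASE
)


def html_to_latex(text):
    """Replace simple HTML formatting tags in `text` with LaTeX commands."""

    def _replace(match):
        command = TAG_TO_LATEX[match.group("tag").lower()]
        return f"{command}{{{match.group('body')}}}"

    # Repeat until nothing changes, so nested tags are converted inside-out.
    previous = None
    while previous != text:
        previous = text
        text = TAG_PATTERN.sub(_replace, text)
    return text
```

Unlike the line-based script above, this works on any string, so it could be applied to every field of an entry rather than only to lines starting with "title".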

jrjhealey · 2018-08-13T11:18:55Z

Yep ok good idea! I'll open another issue for duplicates.
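As a starting point for that other issue, here is a rough sketch of detecting duplicates by citation key alone. It assumes entries start with something like `@article{Key,` and treats keys as case-sensitive; it does not catch the same paper stored under two different keys. `ENTRY_KEY` and `duplicate_keys` are hypothetical names.

```python
import re
from collections import Counter

# Matches the start of an entry such as "@article{Smith2018," and captures the key.
ENTRY_KEY = re.compile(r"^\s*@\w+\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def duplicate_keys(bibtex_text):
    """Return the citation keys that occur more than once, in first-seen order."""
    counts = Counter(ENTRY_KEY.findall(bibtex_text))
    return [key for key, n in counts.items() if n > 1]
```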

I'll commit a folder of different examples that I come up with to my fork of the repo, and then make a PR so you can test against them too perhaps?

So far I've thought of an example for each of:

Italicised text
Sub/superscripts (chemical formulae etc.)
Duplicated bib entries.

In my experience it's quite good at converting special characters in names etc so that's probably enough to cover 90% of the troublesome refs.

Edit:

It looks like sub/superscripts might be difficult, as Mendeley (which I export my bib files from) just coerces them to normal-case letters/numbers (they have no HTML around them).

jrjhealey mentioned this issue Aug 13, 2018

added examples of problematic bibs #4

Closed
