Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TypeError when parsing file that line-continues onto preprocessor directive (include) #236

Open
csubich opened this issue Nov 15, 2022 · 5 comments
Assignees
Labels
bug Something isn't working parser: native Issues related with the native parser of fortls

Comments

@csubich
Copy link

csubich commented Nov 15, 2022

Describe the bug

fortls does not parse include directives when joining continued lines. This leads to incorrect parsing, including a TypeError when the line-joining – which ignores preprocessor directives and comments – reaches the end of the file.

To Reproduce

hellomain.F90:

program hello

   call hello1(1)
   call hello2(1)

end program

hellosub.F90

subroutine hello1 &
   (i)
   integer :: i
   print *, 'Hello World 1!', i
end subroutine

subroutine hello2 &
#include "hellobody.inc"

hellobody.inc:

   (i)
   integer :: i
   print *, 'Hello World 2!', i
end subroutine

Execution and error:

fortls_test$ ifort ./hellosub.F90 ./hellomain.F90 && ./a.out
 Hello World 1!           1
 Hello World 2!           1
fortls_test$ fortls --version
2.13.0
fortls_test$ fortls --debug_parser --debug_filepath ./hellosub.F90

Testing parser
  File = "./hellosub.F90"
  Detected format: free

=========
Parser Output
=========

=== PreProc Pass ===

#include "hellobody.inc" !!! Include statement(8)

!!! Parsing include file "/fs/homeu2/eccc/mrd/rpnatm/csu001/fortls_test/hellobody.inc"
!!! Completed parsing include file


=== Parsing Pass ===

subroutine hello1    (i) !!! SUBROUTINE - Ln:2
   integer :: i !!! VARIABLE - Ln:3
end subroutine !!! END SUBROUTINE Scope - Ln:5
Traceback (most recent call last):
  File "/home/csu001/.local/bin/fortls", line 8, in <module>
    sys.exit(main())
  File "/home/csu001/.local/lib/python3.9/site-packages/fortls/__init__.py", line 49, in main
    debug_server_parser(args)
  File "/home/csu001/.local/lib/python3.9/site-packages/fortls/__init__.py", line 493, in debug_server_parser
    file_ast = file_obj.parse(debug=True, pp_defs=pp_defs, include_dirs=include_dirs)
  File "/home/csu001/.local/lib/python3.9/site-packages/fortls/parse_fortran.py", line 1309, in parse
    _, line, post_lines = self.get_code_line(
  File "/home/csu001/.local/lib/python3.9/site-packages/fortls/parse_fortran.py", line 1093, in get_code_line
    if FRegex.PP_ANY.match(next_line):
TypeError: expected string or bytes-like object

Expected behavior

Correct parsing of hellosub.F90, as per:

fortls_test$ ifort -E ./hellosub.F90 > hellosub_pre.F90
fortls_test$ cat hellosub_pre.F90
# 1 "hellosub.F90"
subroutine hello1 &
   (i)
   integer :: i
   print *, 'Hello World 1!', i
end subroutine

subroutine hello2 &
# 1 "./hellobody.inc" 1
   (i)
   integer :: i
   print *, 'Hello World 2!', i
end subroutine

# 9 "hellosub.F90" 2
fortls_test$ fortls --debug_parser --debug_filepath ./hellosub_pre.F90

Testing parser
  File = "./hellosub_pre.F90"
  Detected format: free

=========
Parser Output
=========

=== PreProc Pass ===


=== Parsing Pass ===

subroutine hello1    (i) !!! SUBROUTINE - Ln:3
   integer :: i !!! VARIABLE - Ln:4
end subroutine !!! END SUBROUTINE Scope - Ln:6
subroutine hello2 # 1 "./hellobody.inc" 1  !!! SUBROUTINE - Ln:9
   integer :: i !!! VARIABLE - Ln:11
end subroutine !!! END SUBROUTINE Scope - Ln:13

=========
Object Tree
=========

2: hello1
  6: hello1::i
2: hello2
  6: hello2::i

=========
Exportable Objects
=========

2: hello1
2: hello2

Setup information (please complete the following information):

fortls_test$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.3 (Ootpa)
fortls_test$ fortls --version
2.13.0
fortls_test$ python --version
Python 3.9.7 :: Intel Corporation

Configuration information (please complete the following information):

No configuration settings used to reproduce this error.

Additional context

The narrow crash is caused by the iteration in parse_fortran.py, where line 1090 performs an unconditional get_line that returns None at the end of the file, but line 1093 assumes that the returned line is a valid object (string) for regular expression parsing.

The broad error is caused because the parser tries to assemble complete lines before any preprocessor shenanigans, whereas compilable Fortran (albeit with horrible style — the error originates from a legacy codebase) applies the preprocessor before line continuation. That means that a line can continue into (or out of!) an included file.

@csubich csubich added the bug Something isn't working label Nov 15, 2022
@gnikit gnikit self-assigned this Nov 16, 2022
@gnikit
Copy link
Member

gnikit commented Nov 16, 2022

Thanks for the report, this does not look like valid Fortran, did you make a mistake while posting the code?

subroutine hello2 &
#include "hellobody.inc"
fortls_test$ cat hellobody.inc
(i)
integer :: i
print *, 'Hello World 2!', i
end subroutine

This needs a more thorough look, I will get to it later this week.

@csubich
Copy link
Author

csubich commented Nov 16, 2022

It's two separate files. The first (hellosub.F90) contains at its end:

subroutine hello2 &
#include "hellobody.inc"

while the second (hellobody.inc) contains

(i)
integer :: i
print *, 'Hello world 2!', i
end subroutine

After the preprocessor is applied (via ifort -E, for example), the resulting fortran file is valid and compiles successfully.

The perversity tickled here is that line continuation has a lower precedence than preprocessor #include directives. In the legacy codebase that brought up this issue, these sorts of inclusions are used alongside other #defines to implement a rough and ready type of templated code.

A workaround to avoid the crash just moves the end subroutine from the template body into the declaration file (here into hellosub.F90). However, this isn't a complete fix to the issue: fortls processes the file without crashing (good!), but it doesn't pick up on the subroutine's call signature (less good).

I've also updated the initial comment in this issue to fully separate the three files in the minimum example.

@gnikit gnikit added the parser: native Issues related with the native parser of fortls label Jan 13, 2023
RPN-SI pushed a commit to ECCC-ASTD-MRD/librmn that referenced this issue Nov 7, 2023
The mock-template code that uses the C preprocessor to
define multiple versions of Interp/Extrap functions is
incompatible with Fortran linting, crashing the linter
(see fortran-lang/fortls#236)
because it tries to process the Fortran syntax without
invoking the preprocessor.

This code adjusts the templates to hold just the subroutine
body, leaving the `end subroutine` statement for the
enclosing Fortran file.  This change fixes the linter crash,
although it still doesn't properly understand the
subroutines.
RPN-SI pushed a commit to ECCC-ASTD-MRD/librmn that referenced this issue Dec 5, 2023
The mock-template code that uses the C preprocessor to
define multiple versions of Interp/Extrap functions is
incompatible with Fortran linting, crashing the linter
(see fortran-lang/fortls#236)
because it tries to process the Fortran syntax without
invoking the preprocessor.

This code adjusts the templates to hold just the subroutine
body, leaving the `end subroutine` statement for the
enclosing Fortran file.  This change fixes the linter crash,
although it still doesn't properly understand the
subroutines.
@gnikit
Copy link
Member

gnikit commented May 12, 2024

The problem here is that you are essentially trying to inject new AST nodes into a file during preprocessing.
The #include ... directive is not expanded in the fortls preproc in order to keep a 1:1 line mapping between the before and after preprocessing versions of the file. Only the defines and macro definitions are imported.

If the new content is injected in place, like in hellosub_pre.F90 then the rest of the parsing is incomplete.

@gnikit
Copy link
Member

gnikit commented May 12, 2024

I don't know which of the 2 is a better strategy to address this.

  1. Treat #include the same as Fortran include, pros: the included and includer file can be tracked (and hence updated) separately, cons: it will make the preprocessor "impure", mixing the two concepts and harder to maintain.
  2. Inject contents of #include in-place, pros: this is closer to users' expectations, cons: will have to architect a way to handle the post-processed version of the text.

@gnikit
Copy link
Member

gnikit commented May 19, 2024

This PR #399 should stop this from crashing. Still not preprocessor AST scope injection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working parser: native Issues related with the native parser of fortls
Projects
None yet
Development

No branches or pull requests

2 participants