
Intermittent IndexError in pytest-cov during Apache Beam Unit Tests: INTERNALERROR> IndexError: bytearray index out of range #607

momonala opened this issue Aug 21, 2023 · 10 comments

@momonala

momonala commented Aug 21, 2023

Summary

I get this traceback when running pytest unit tests that use the Apache Beam testing framework.

This is an intermittent issue. It often resolves if I rerun the test suite 1-3 times.

Based on the traceback, the error is related to pytest_cov/plugin.py and related modules: it originates in the pytest-cov plugin, which is responsible for coverage reporting during the test run and interacts with the coverage library to collect and report coverage data. The error occurs while flushing coverage data, i.e. when the coverage data collected during test execution is processed and saved.

All tests in the pytest output are marked as passed; the issue occurs regardless of whether tests pass or fail.

I only get this issue on my test suite that uses Apache Beam. It does not happen on a different test suite with the same pytest/coverage versions installed but without Beam tests or the Beam package. I am using a DirectRunner for Beam, and tests run sequentially (i.e. no parallelism).

INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 270, in wrap_session
INTERNALERROR>     session.exitstatus = doit(config, session) or 0
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/_pytest/main.py", line 324, in _main
INTERNALERROR>     config.hook.pytest_runtestloop(session=session)
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/pluggy/_hooks.py", line 433, in __call__
INTERNALERROR>     return self._hookexec(self.name, self._hookimpls, kwargs, firstresult)
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/pluggy/_manager.py", line 112, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/pluggy/_callers.py", line 133, in _multicall
INTERNALERROR>     teardown[0].send(outcome)
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/pytest_cov/plugin.py", line 298, in pytest_runtestloop
INTERNALERROR>     self.cov_controller.finish()
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/pytest_cov/engine.py", line 44, in ensure_topdir_wrapper
INTERNALERROR>     return meth(self, *args, **kwargs)
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/pytest_cov/engine.py", line 250, in finish
INTERNALERROR>     self.cov.save()
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/coverage/control.py", line 780, in save
INTERNALERROR>     data = self.get_data()
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/coverage/control.py", line 860, in get_data
INTERNALERROR>     if self._collector.flush_data():
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/coverage/collector.py", line 499, in flush_data
INTERNALERROR>     self.covdata.add_lines(self.mapped_file_dict(line_data))
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/coverage/sqldata.py", line 125, in _wrapped
INTERNALERROR>     return method(self, *args, **kwargs)
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/coverage/sqldata.py", line 499, in add_lines
INTERNALERROR>     linemap = nums_to_numbits(linenos)
INTERNALERROR>   File "/usr/local/lib/python3.10/site-packages/coverage/numbits.py", line 42, in nums_to_numbits
INTERNALERROR>     b[num//8] |= 1 << num % 8
INTERNALERROR> IndexError: bytearray index out of range

Expected vs actual result

The unit tests should run successfully without encountering the IndexError mentioned above. The coverage reporting process should handle data flushing reliably, regardless of the code paths being executed during the tests.

Reproducer

Versions

Python: 3.10
pytest-cov: 4.1.0
coverage: 7.3.0
apache_beam: 2.46.0
Test environment: Docker container running on top of Ubuntu (Github Actions CI machines)
Test command:

docker compose ... run \
                         --name  container_name \
                         container_name \
                         pytest  \
                         --tb=native \
                         -s \
                         --cache-clear \
                         --no-cov-on-fail \
                         --cov-config=.coveragerc \
                         --cov=pipeline \
                         --cov-report html:reports/cov-html/pipeline-unit \
                         --cov-report xml:reports/cov/pipeline-unit.xml \
                         --junitxml=reports/junit/pipeline-unit.xml

Config

.coveragerc

# .coveragerc to control coverage.py
[run]
branch = False
relative_files = True

[report]
# Regexes for lines to exclude from consideration
exclude_lines =
    # Have to re-enable the standard pragma
    pragma: no cover

    # Don't complain about missing debug-only code:
    def __repr__
    if self\.debug

    # Don't complain if tests don't hit defensive assertion code:
    raise AssertionError
    raise NotImplementedError

    # Don't complain if non-runnable code isn't run:
    if 0:
    if __name__ == .__main__.:

ignore_errors = True

omit =
    python/*
    *test_*
    *i_test_*
    *__init__.py

@momonala momonala changed the title INTERNALERROR> IndexError: bytearray index out of range Intermittent IndexError in pytest-cov during Apache Beam Unit Tests: INTERNALERROR> IndexError: bytearray index out of range Aug 21, 2023
@nedbat
Collaborator

nedbat commented Aug 21, 2023

Do you have a link to code we can run to reproduce the error?

@momonala
Author

Do you have a link to code we can run to reproduce the error?

Unfortunately it's a private repo for work and I haven't been able to reproduce the error with toy tests. If I'm able to, I'll post the code here.

Do you have any thoughts on things I can look into, or where the error might originate from, if you've seen something like this before?

@nedbat
Collaborator

nedbat commented Aug 22, 2023

This error will happen if nums has a value less than -8. I'm not sure why you have that. If you are able to, edit nums_to_numbits to print out nums, so we can see what's in it.
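
For reference, a rough sketch of what that edit could look like, paraphrased from coverage 7.x's numbits.py (the installed copy under site-packages may differ slightly):

# coverage/numbits.py -- nums_to_numbits with a temporary debug print added
def nums_to_numbits(nums):
    """Pack an iterable of line numbers into a bitmap blob, one bit per line."""
    nums = list(nums)                        # materialize so we can print and reuse it
    print("nums_to_numbits:", sorted(nums))  # temporary: show what we were given
    nbytes = max(nums) // 8 + 1              # buffer sized from the largest line number
    b = bytearray(nbytes)
    for num in nums:
        b[num // 8] |= 1 << num % 8          # the statement that raises at numbits.py:42
    return bytes(b)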

@Al-Minkin

I am a contributor on the same project - nums seems to contain a lot of large negative values, which are also inconsistent between separate test runs (which is why this problem only occurs sometimes, I presume - by chance, negative numbers may not occur). Here are examples from two different test runs:
{1, 3, 5, 8, 13, 16, 20, 23, 29, 31, 32, 33, 36, 37, 38, 39, 40, 41, 42, 43, 45, 54, 62, 63, -190, 77, 94, 100, 107, -20, 110, -15, 123, 124, -131}
{1, 3, 5, 8, 13, 16, 18, 20, 23, 24, 29, 31, 33, 36, 37, 38, 39, 40, 41, 42, 43, -92, 45, -90, 44, -74, -52, -51, -37, -34, -160, -20, 124, 126}
Negative numbers only seem to appear when covering one specific file, the contents of which I am not comfortable disclosing at this time. Any idea why this might be happening, on general principles? It's a bit hard for me to understand how the line counts are calculated here.

@nedbat
Collaborator

nedbat commented Feb 6, 2024

@Al-Minkin do the positive numbers correspond to executable line numbers in your source file? Also, if you negate the negative numbers, do they correspond to lines in your source file?

BTW: I was wrong, this isn't because you have values less than -8, it's because you have negative numbers larger (in absolute value) than your largest positive number. BUT: you shouldn't have any negative numbers in the first place.
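
A minimal standalone illustration (not from the project's code) of why that combination overflows: the bytearray is sized from max(nums), and Python's negative indexing only wraps within that length.

nums = {124, -20, -190}               # shaped like the first set logged above
b = bytearray(max(nums) // 8 + 1)     # 16 bytes, sized from the largest positive number, 124
b[-20 // 8] |= 1 << (-20 % 8)         # index -3 wraps to 13, so it silently "works"
b[-190 // 8] |= 1 << (-190 % 8)       # index -24 is out of range -> IndexError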

Can you add --debug=sys to your coverage run line, and share the output?

@Al-Minkin

I don't think we have a separate coverage run line - we run coverage as a module as part of pytest. I tried adding it to pytest itself, but the output wasn't meaningfully different to my eyes. We have a coveragerc config that we supply to it - what would I add to it to achieve the same result?

As for line numbers - it's hard for me to say what they correspond to because I am not sure what that line count represents. Is that cumulative line counts per test, total line counts in the file, individual line numbers that have been executed so far, or something else?

@nedbat
Collaborator

nedbat commented Feb 15, 2024

You can add this to your coveragerc file:

[run]
debug = sys

The set of numbers that you are seeing are not line counts. They are line numbers, which is why they should not be negative. The error is happening where coverage is trying to record the set of line numbers that have been executed. It would be useful to know how the line numbers you are seeing correspond to the contents of the file.
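
A tiny round-trip showing that these sets are stored as line-number bitmaps, one bit per executed line (function names assume coverage 7.x's coverage.numbits module):

from coverage.numbits import nums_to_numbits, numbits_to_nums

blob = nums_to_numbits({1, 3, 5, 8})   # pack executed line numbers into a blob
print(numbits_to_nums(blob))           # [1, 3, 5, 8] -- the same line numbers back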

@Al-Minkin

I've looked into this issue further and I think I've solved it on our end, but an upstream solution may be preferable on your side, since I still do not really know the true cause of the error. Here are the results of my investigation:

First of all, I re-ran the test suite about 50 times and logged all linenos passed to the nums_to_numbits function. Then I compared them from one test run to the next. For all files except one, which we'll call test_utils.py, the set of line numbers is exactly the same from one test suite run to the next.

For test_utils.py, the situation is a bit more complex. The set of linenos can be split into two subsets - let's call them "true lines" and "ghost lines". True lines correspond to the code lines in test_utils.py, are consistent from test run to test run, and include almost every code line that is expected to be covered except two (see below). Ghost lines seem to be random numbers from about -300 to 300 (the range is not exact; test_utils.py is only about 100 lines long), correspond to nothing, and if by chance a ghost line happens to be negative enough, the coverage collection fails entirely.

The key part of the file (as well as the change I have made to resolve the problem) looks like this:

def match_recursive_any_order(expected):
    """
    Custom hamcrest-based matcher that matches dicts and lists recursively,
    and floats with a tolerance. But does not care about the order of items.
    Can be used in a beam.testing.util.assert_that
    """
++  matcher = _match_recursive(list(expected))

--  def _matches(actual):
--      expected_list = list(expected)                                     <--- not covered by true lines, but is executed
--      hamcrest_assert_that(actual, _match_recursive(expected_list))      <--- not covered by true lines, but is executed

--  return _matches
++  return lambda actual: hamcrest_assert_that(actual, matcher)


def _match_recursive(expected):
    if isinstance(expected, dict):
        return has_entries({k: _match_recursive(v) for k, v in expected.items()})
    elif isinstance(expected, list):
        return contains_inanyorder(*[_match_recursive(v) for v in expected])
    elif isinstance(expected, float):
        return close_to(expected, 0.0001)
    else:
        return equal_to(expected)

After this change was made, all ghost lines vanished from the logs and the test suite coverage stopped crashing.

The reason this change worked is not entirely clear to me, but I suspect it has something to do with the way Beam pickles functions or executes matchers, which may interfere with line coverage somehow. Hence, I think an upstream solution may be necessary.
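
For context, a hypothetical usage of this matcher in a Beam test (the import path and test data are made up; only apache_beam's testing helpers are real). The callable returned by match_recursive_any_order is what assert_that hands to the runner, which is presumably where the pickling mentioned above comes in:

import apache_beam as beam
from apache_beam.testing.test_pipeline import TestPipeline
from apache_beam.testing.util import assert_that
from test_utils import match_recursive_any_order  # hypothetical import path

with TestPipeline() as p:
    output = p | beam.Create([{"id": 1, "score": 0.30000001}])
    # assert_that calls the matcher with the collected elements of the PCollection
    assert_that(output, match_recursive_any_order([{"id": 1, "score": 0.3}]))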

@nedbat
Collaborator

nedbat commented Feb 27, 2024

I don't see why those changes would have affected this behavior, but I can't understand how the negative line numbers happen in the first place. Is there any way you can give me a way to run your code? We can talk privately in email if needed: [email protected].

Can you create a different Apache Beam project that also demonstrates the problem?

@Al-Minkin

Sadly I can't justify committing more company time to this problem, especially since it no longer breaks our builds. I just wanted to document my findings in case someone else has a similar problem in the future.
