Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test execution killed due to ORA-04036 when running with coverage reporters #1248

Open
adrianhj opened this issue Mar 19, 2023 · 2 comments
Open

Comments

@adrianhj
Copy link

adrianhj commented Mar 19, 2023

Describe the bug
Not 100% sure this is a bug as such or whether this is 'just Oracle', but opening for discussion on ideas/possibilities/workarounds; somewhat discussed in https://utplsql.slack.com/archives/C0PK55Z08/p1663013558850999 previously.

When executing our test suite with coverage reporters enabled the executing session is killed due to ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT:

Errors in file /opt/oracle/diag/rdbms/xe/XE/trace/XE_ora_346.trc  (incident=698) (PDBNAME=XEPDB1):
ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT
XEPDB1(3):Incident details in: /opt/oracle/diag/rdbms/xe/XE/incident/incdir_698/XE_ora_346_i698.trc

Observing memory usage of the session via a custom reporter dumping the following query result into a table via an after_calling_test hook we observe that usage is continuously growing without freeing until it hits the limits of the environment and the session is killed (~1.5GB memory used):

    SELECT
       ROUND(p.pga_used_mem/1024/1024,2) AS pga_used_mem_mb,
       ROUND(p.pga_alloc_mem/1024/1024,2) AS pga_alloc_mem_mb,
       ROUND(p.pga_freeable_mem/1024/1024,2) AS pga_freeable_mem_mb,
       ROUND(p.pga_max_mem/1024/1024,2) AS pga_max_mem_mb
    FROM   v$session s,
           v$process p
    WHERE  s.paddr = p.addr
    AND    s.AUDSID = (select sys_context ('USERENV', 'SESSIONID') from dual);

Executing the same suite without a coverage reporter the memory usage is steady at ~a few hundred MB of memory usage.

It feels like the PL/SQL Profiler is keeping all coverage data in memory throughout, leading us to the out of memory situation and session being killed.

Modifying https://github.com/utPLSQL/utPLSQL/blob/develop/source/core/coverage/ut_coverage_helper_profiler.pkb#L44 to call dbms_profiler.flush_data() seemed to produce a more saw-tooth shaped pattern of usage at the cost of now extremely slow test execution (from 10-15 minutes to hours).

We also attempted to add a 'profiler flusher' reporter which periodically invoked dbms_profiler.flush_data after each suite, but this does not seem to have achieved much:

  overriding member procedure after_calling_suite(self in out nocopy ut_profiler_flusher, a_suite ut_logical_suite) as
    l_return_code binary_integer;
  begin
    (self as ut_coverage_reporter_base).after_calling_suite(a_suite);
    ut_coverage.coverage_resume();
    l_return_code := dbms_profiler.flush_data();
    ut_coverage.coverage_pause();
  end;

Provide version info
Information about utPLSQL and Database version:

set serveroutput on
declare
  l_version varchar2(255);
  l_compatibility varchar2(255);
begin
  dbms_utility.db_version( l_version, l_compatibility );
  dbms_output.put_line( l_version );
  dbms_output.put_line( l_compatibility );
end;
/
select substr(ut.version(),1,60) as ut_version from dual;
select * from v$version;
select * from nls_session_parameters;
select substr(dbms_utility.port_string,1,60) as port_string from dual;

21.0.0.0.0
21.0.0


PL/SQL procedure successfully completed.


UT_VERSION                                                  
------------------------------------------------------------
v3.1.12.3589


BANNER                                                                           BANNER_FULL                                                                                                                                                      BANNER_LEGACY                                                                        CON_ID
-------------------------------------------------------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ----------
Oracle Database 21c Express Edition Release 21.0.0.0.0 - Production              Oracle Database 21c Express Edition Release 21.0.0.0.0 - Production                                                                                              Oracle Database 21c Express Edition Release 21.0.0.0.0 - Production                       0
                                                                                 Version 21.3.0.0.0                                                                                                                                                                                                                                          



PARAMETER                      VALUE                                                           
------------------------------ ----------------------------------------------------------------
NLS_LANGUAGE                   ENGLISH                                                         
NLS_TERRITORY                  UNITED KINGDOM                                                  
NLS_CURRENCY                   £                                                               
NLS_ISO_CURRENCY               UNITED KINGDOM                                                  
NLS_NUMERIC_CHARACTERS         .,                                                              
NLS_CALENDAR                   GREGORIAN                                                       
NLS_DATE_FORMAT                DD-MON-RR                                                       
NLS_DATE_LANGUAGE              ENGLISH                                                         
NLS_SORT                       BINARY                                                          
NLS_TIME_FORMAT                HH24.MI.SSXFF                                                   
NLS_TIMESTAMP_FORMAT           DD-MON-RR HH24.MI.SSXFF                                         

PARAMETER                      VALUE                                                           
------------------------------ ----------------------------------------------------------------
NLS_TIME_TZ_FORMAT             HH24.MI.SSXFF TZR                                               
NLS_TIMESTAMP_TZ_FORMAT        DD-MON-RR HH24.MI.SSXFF TZR                                     
NLS_DUAL_CURRENCY              €                                                               
NLS_COMP                       BINARY                                                          
NLS_LENGTH_SEMANTICS           BYTE                                                            
NLS_NCHAR_CONV_EXCP            FALSE                                                           

17 rows selected. 


PORT_STRING                                                 
------------------------------------------------------------
x86_64/Linux 2.4.xx

Information about client software
SQL Developer/utPLSQL Maven Plugin

To Reproduce
Steps to reproduce the behavior:

  1. Execute a somewhat sizeable test suite with a coverage reporter enabled (e.g. Sonar)
begin
    ut.run(ut_coverage_sonar_reporter());
end;
/
  1. Monitor memory usage of test session as it executes
  2. Observe ever-growing memory usage by the session

Repeat above comparing to a test run without a coverage reporter enabled.

Expected behavior
Test suite completes successfully without ever growing PGA usage leading to out-of-memory and session killing.

Example code
N/A

Additional context

  • ~100k LOC over ~500 objects in a single schema
  • ~4.5k tests with ~50% coverage
@jgebal
Copy link
Member

jgebal commented May 28, 2023

@adrianhj
Thanks for documenting this issue here.
I spent some time investigating problems with coverage and may also reach out to Oracle Development for help on that topic.

  • Is your code activity spanning across different schemas when you run the test suite?
    That is - even though you're running tests in schema X does the code from schema X invokes code from schemas A,B,C,D... etc?
  • Does the user invoking the tests have privileges that would allow him to run profiler on other schemas?

If the answer to both questions = yes then it may be that the coverage is gathered on all the code from all the schemas that are used on the test run.

I will need to do some local experiments to understand this but it is clearly a problem with DBMS_PROFILER and how much memory is needed to store the information.

If you see 1.5 GB mem used and your codebase is only 100k lines, it would indicate that DBMS_PROFILER
needs around 1.5kB of memory for each line of code.
So profiler must be gathering more information than you need.

I have never tried it, but maybe limiting the privileges of the user that is running the tests.
image
Link

You would need to make sure that the user who is running the tests does not have the CREATE ANY PROCEDURE privilege.

@jgebal
Copy link
Member

jgebal commented May 28, 2023

@adrianhj
Are you able to provide the graph of memory usage with & without flush functionality that you added in slack?
It would be very helpful to keep it here as slack has already hidden the conversation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants