Error ORA-03113: end-of-file on communication channel

I have a 400-line SQL query that throws an exception within 30 seconds:

ORA-03113: end-of-file on communication channel

Below are things to note:

  1. I have set the timeout to 10 minutes.
  2. Removing the last condition resolves the error.
  3. The error appeared only recently, after I analyzed the indexes.

The troublesome condition looks like this:

AND UPPER (someMultiJoin.someColumn) LIKE UPPER ('%90936%')

So my assumption is that the server is terminating the query, apparently because it has been identified as a resource hog.

Is my assumption correct? How should I go about fixing this problem?

EDIT: I tried to get the explain plan for the faulty query, but the EXPLAIN PLAN statement also gives me an ORA-03113 error. I understand that my query is not very performant, but why should that cause an ORA-03113 error? I am running the query from Toad, and no alert log entries or trace files are generated. My DB version is
Oracle9i Enterprise Edition Release 9.2.0.7.0 - Production

asked Jul 28, 2010 at 7:01

Ravi Gupta

One possible cause of this error is a thread crash on the server side. Check whether the Oracle server has generated any trace files, or logged any errors in its alert log.

You say that removing one condition from the query causes the problem to go away. How long does the query take to run without that condition? Have you checked the execution plans for both versions of the query to see if adding that condition is causing some inefficient plan to be chosen?

answered Jul 28, 2010 at 12:32

Dave Costa

You can safely remove UPPER from both sides if the LIKE pattern contains only digits (which have no case). This can reduce the time spent evaluating the LIKE condition.

AND UPPER (someMultiJoin.someColumn) LIKE UPPER ('%90936%')

is equivalent to:

AND someMultiJoin.someColumn LIKE '%90936%'

Numbers are not affected by UPPER (and % is independent of character casing).

answered Aug 11, 2010 at 15:50

Dubas

I’ve had similar connection dropping issues with certain variations on a query. In my case connections dropped when using rownum under certain circumstances. It turned out to be a bug that had a workaround by adjusting a certain Oracle Database configuration setting. We went with a workaround until a patch could be installed. I wish I could remember more specifics or find an old email on this but I don’t know that the specifics would help address your issue. I’m posting this just to say that you’ve probably encountered a bug and if you have access to Oracle’s support site (support.oracle.com) you’ll likely find that others have reported it.

Edit:
I had a quick look at Oracle support. There are more than 1000 bugs related to ORA-03113 but I found one that may apply:

Bug 5015257: QUERY FAILS WITH ORA-3113 AND COREDUMP WHEN QUERY_REWRITE_ENABLED='TRUE'

To summarize:

  • Identified in 9.2.0.6.0 and fixed in 10.2.0.1
  • Running a particular query (not identified) causes ORA-03113
  • Running explain on the query does the same
  • There is a core file in $ORACLE_HOME/dbs
  • Workaround is to set QUERY_REWRITE_ENABLED to false: alter system set query_rewrite_enabled = FALSE;

Another possibility:

Bug 3659827: ORA-3113 FROM LONG RUNNING QUERY

  • 9.2.0.5.0 through 10.2.0.0
  • Problem: Customer has a long-running query that consistently produces ORA-3113 errors. On the customer's system they receive core.log files but no errors in the alert.log. On the test system I used, I received ORA-7445 errors.
  • Workaround: set "_complex_view_merging"=false at session level or instance level.

answered Aug 13, 2010 at 1:10

jlpp

From the information so far it looks like a back-end crash, as Dave Costa suggested some time ago. Were you able to check the server logs?

Can you get the plan with set autotrace traceonly explain? Does it happen from SQL*Plus locally, or only with a remote connection? Certainly sounds like an ORA-600 on the back-end could be the culprit, particularly if it’s at parse time. The successful run taking longer than the failing one seems to rule out a network problem. I suspect it’s failing quite quickly but the client is taking up to 30 seconds to give up on the dead connection, or the server is taking that long to write trace and core files.

Which probably leaves you the option of patching (if you can find a relevant fix for the specific ORA-600 on Metalink) or upgrading the DB; or rewriting the query to avoid it. You may get some ideas for how to do that from Metalink if it’s a known bug. If you’re lucky it might be as simple as a hint, if the extra condition is having an unexpected impact on the plan. Is someMultiJoin.someColumn part of an index that’s used in the successful version? It’s possible the UPPER is confusing it and you could persuade it back on to the successful plan by hinting it to use the index anyway, but that’s obviously rather speculative.

answered Aug 10, 2010 at 11:21

Alex Poole

It means you have been disconnected. This is not likely to be due to being a resource hog.

I have seen cases where the connection to the DB runs over NAT and, because there is no traffic, the NAT device closes the tunnel and thus drops the connection. Generally, if you use connection pooling you won’t get this.

answered Jul 28, 2010 at 7:31

Daniel

As @Daniel said, the network connection to the server is being broken. You might take a look at End-of-file on communication channel to see if it offers any useful suggestions.

Share and enjoy.

answered Jul 28, 2010 at 11:27

Bob Jarvis - Слава Україні

This is often a bug in the Cost Based Optimizer with complex queries.

What you can try is to change the execution plan. For example, use WITH to pull some subqueries out, or use the SELECT /*+ RULE */ hint to prevent Oracle from using the CBO. Dropping the statistics can also help, because Oracle then chooses a different execution plan.

If you can update the database, make a test installation of 9.2.0.8 and see if the error is gone there.

Sometimes it helps to make a dump of the schema, drop everything in it and import the dump again.

answered Aug 6, 2010 at 6:45

andrem

I was having the same error. In my case the cause was the length of the query.

Once I shortened the query, I had no more problems.

answered Sep 2, 2022 at 12:10

Alexander Martins

After hours of misdirection from official Oracle support, I dove into this on my own and fixed it. I am documenting it here in case someone else has this problem.

To do any of this, you must be the oracle user:

$ su - oracle

Step 1: You need to look at the alert log. It isn’t in /var/log as expected. You have to run an Oracle log reading program:

$ adrci
ADRCI: Release 11.2.0.1.0 - Production on Wed Sep 11 18:27:56 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
ADR base = "/u01/app/oracle"
adrci>

Notice the ADR base. That is not the install. You need to see the homes so you can connect to the one that you use.

adrci> show homes
ADR Homes:
diag/rdbms/cci/CCI
diag/tnslsnr/cci/listener
diag/tnslsnr/cci/start
diag/tnslsnr/cci/reload

CCI is the home. Set that.

adrci> set home diag/rdbms/cci/CCI
adrci>

Now, you can look at the alert logs. It would be very nice if they were in /var/log so you could easily parse the logs. Just stop wanting and deal with this interface. At least you can tail (and I hope you have a scrollback buffer):

adrci> show alert -tail 100
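If you prefer ordinary text tools, the alert log normally also exists as a plain-text file under the ADR home. The following is a minimal shell sketch, assuming the usual layout of ADR base, then ADR home, then trace/alert_<instance>.log. The function name alert_log_path is made up for illustration, and the resulting path should be checked on your own system before relying on it.

```shell
#!/bin/sh
# Build the expected path of the plain-text alert log.
# Assumption: <ADR base>/<ADR home>/trace/alert_<instance>.log, where
# <instance> is the last path component of the ADR home.
alert_log_path() {
    adr_base=$1
    adr_home=$2
    instance=${adr_home##*/}   # strip everything up to the last slash
    printf '%s/%s/trace/alert_%s.log\n' "$adr_base" "$adr_home" "$instance"
}

# Values taken from the adrci session shown above.
alert_log_path /u01/app/oracle diag/rdbms/cci/CCI
```

With a path like that, tools such as tail, less, and grep can read the log directly.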

Scroll back until you see errors. You want the FIRST error. Any errors after the first error are likely being caused by the first error. In my case, the first error was:

ORA-19815: WARNING: db_recovery_file_dest_size of 53687091200 bytes is 100.00% used, and has 0 remaining bytes available.
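Finding that first error can be scripted once the recent part of the log is available as text. This is a minimal sketch, not part of any Oracle tooling: first_ora_error is a hypothetical helper that prints the first line containing an ORA-nnnnn code, reading either a file argument or standard input.

```shell
#!/bin/sh
# Print the first line that contains an ORA-<digits> code, prefixed with its
# line number in the input. Prints nothing when no line matches.
# Reads the files given as arguments, or standard input when there are none.
first_ora_error() {
    awk '/ORA-[0-9]+/ { print NR ": " $0; exit }' "$@"
}

# Example (the file name is a placeholder for your own alert log):
# tail -n 100 alert_CCI.log | first_ora_error
```

When the input comes from tail, the line number is relative to the tail output, not to the whole log. Feeding it only the recent tail matters, because an old alert log may contain unrelated ORA- messages from earlier incidents.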

This is caused by transactions. If you push a lot of data into Oracle, it saves transaction logs (archived redo logs), and those go into the recovery file area. Once that area is full (50 GB in this case), Oracle just dies. By design, if anything is messed up, Oracle will respond by shutting down.

There are two solutions, the proper one and the quick and dirty one. The quick and dirty one is to increase db_recovery_file_dest_size. First, exit adrci.

adrci> exit

Now, go into sqlplus without opening the database, just mounting it (you may be able to do this without mounting the database, but I mount it anyway).

$ sqlplus /nolog
SQL*Plus: Release 11.2.0.1.0 Production on Wed Sep 11 18:40:25 2013
Copyright (c) 1982, 2009, Oracle. All rights reserved.
SQL> connect / as sysdba
Connected.
SQL> startup mount

Now, you can increase your current db_recovery_file_dest_size, increased to 75G in my case:

SQL> alter system set db_recovery_file_dest_size = 75G scope=both;

Now you can shut down and start up again, and the previous error should be gone.

The proper fix is to get rid of the recovery files. You do that using RMAN, not SQLPLUS or ADRCI.

$ rman
Recovery Manager: Release 11.2.0.1.0 - Production on Wed Sep 11 18:45:11 2013
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
RMAN> backup archivelog all delete input;

If you get RMAN-06171: not connected to target database, then try rman target / instead of just rman.

Wait a long time and your archived logs (which were using up all that space) will be gone. Then you can shut down and start up your database and be back in business.

We have a test Oracle XE database (11.2.0.1.x86_64) on Oracle Linux 5.2 x86_64.
When trying to open the database from sqlplus, we get this error:

ORA-03113: end-of-file on communication channel
Process ID: 4862
Session ID: 91 Serial number: 3

Google hints that the problem may be related to free space. Let's check it:
$ df -h

At the operating-system level everything looks fine.
Let's see what alert.log says:
$ view /u01/app/oracle/diag/rdbms/xe/XE/trace/alert_XE.log

After the attempt to open the database we see this entry:
ORA-19815: WARNING: db_recovery_file_dest_size of 10737418240 bytes is 100.00% used, and has 0 remaining bytes available.
************************************************************************
You have following choices to free up space from recovery area:
1. Consider changing RMAN RETENTION POLICY. If you are using Data Guard,
   then consider changing RMAN ARCHIVELOG DELETION POLICY.
2. Back up files to tertiary device such as tape using RMAN
   BACKUP RECOVERY AREA command.
3. Add disk space and increase db_recovery_file_dest_size parameter to
   reflect the new space.
4. Delete unnecessary files using RMAN DELETE command. If an operating
   system command was used to delete files, then use RMAN CROSSCHECK and
   DELETE EXPIRED commands.
************************************************************************
ARCH: Error 19809 Creating archive log file to '/u01/app/oracle/fast_recovery_area/XE/archivelog/2015_04_21/o1_mf_1_243_%u_.arc'
Errors in file /u01/app/oracle/diag/rdbms/xe/XE/trace/XE_ora_5607.trc:

In other words, the entire recovery area is full of something.
Along the way it becomes clear where the archived logs are written:
/u01/app/oracle/fast_recovery_area/XE/archivelog/
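The limit in the ORA-19815 message is given in bytes, which is awkward to compare with the output of df -h or du -sh. A small shell sketch that pulls the byte count out of such a message and converts it to whole GiB is shown here. The helper name fra_limit_gib is hypothetical, and the sketch assumes the message wording shown in the alert log above.

```shell
#!/bin/sh
# Extract the byte count from an ORA-19815 message and print it in whole GiB.
# Returns non-zero when the message does not contain the expected wording.
fra_limit_gib() {
    bytes=$(printf '%s\n' "$1" |
        sed -n 's/.*db_recovery_file_dest_size of \([0-9][0-9]*\) bytes.*/\1/p')
    [ -n "$bytes" ] || return 1
    echo "$(( bytes / 1024 / 1024 / 1024 ))"
}

# The message from the alert log above.
fra_limit_gib 'ORA-19815: WARNING: db_recovery_file_dest_size of 10737418240 bytes is 100.00% used, and has 0 remaining bytes available.'
```

For this log entry the limit comes out as a round number of GiB. If the limit is not an exact multiple, the integer division rounds down.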

Let's see what is filling the fast recovery area:
du -sh /u01/app/oracle/fast_recovery_area/XE/*
10G    /u01/app/oracle/fast_recovery_area/XE/archivelog

We see that the archived logs have eaten 10 GB. They need to be cleaned up.
Start Recovery Manager:
$ rman TARGET /
RMAN> list backup;

RMAN-03002: failure of list command at 04/21/2015 14:47:28
ORA-01507: database not mounted

The database must be mounted for RMAN to list the backups.
Mount it:
$ sqlplus / as sysdba
SQL> alter database mount;

Database altered.

List the backups:
$ rman TARGET /
RMAN> list backup;

specification does not match any backup in the repository

There is not a single backup. That is sad in itself, but acceptable for a test database.

On the other hand, we see about 300 archived logs:
$ rman TARGET /
RMAN> list archivelog all;
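As a rough cross-check from the operating system, the archived log files on disk can be counted as well. This is only a sketch: count_archivelogs is a hypothetical helper, and it assumes the files carry the .arc suffix seen in the alert log path above. RMAN reports what is recorded in the control file, so the two numbers may legitimately differ.

```shell
#!/bin/sh
# Count files with the .arc suffix anywhere under the given directory.
count_archivelogs() {
    find "$1" -type f -name '*.arc' | wc -l | tr -d ' '
}

# Directory taken from the alert log above; adjust it to your own recovery area.
dir=/u01/app/oracle/fast_recovery_area/XE/archivelog
if [ -d "$dir" ]; then
    count_archivelogs "$dir"
fi
```

Do not delete these files with rm. As the alert log message itself points out, files removed with operating-system commands still have to be reconciled with RMAN CROSSCHECK and DELETE EXPIRED.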

To fix the situation, we temporarily increase db_recovery_file_dest_size, create a backup, and delete the archived logs that are no longer needed.
$ sqlplus / as sysdba
SQL> alter system set db_recovery_file_dest_size=12G;

System altered.

Back up the database, the control file, and the archived logs.
$ rman TARGET /
RMAN> backup as compressed backupset database plus archivelog delete input;

Set the db_recovery_file_dest_size initialization parameter back to its previous value:
$ sqlplus / as sysdba
SQL> alter system set db_recovery_file_dest_size=10G;

System altered.

It works!

Oerr Utility output:
Cause: The connection between Client and Server process was broken.
Action: There was a communication error that requires further investigation.

Check the alert_sid.log file on the server. This may be an indication that the communications link may have gone down at least temporarily, or it may indicate that the server has gone down.

When starting the Oracle database, I got the ORA-03113 error during the startup command. My first step was to look at the alert log file, which helps to find the cause of the error.

My research turned up the following causes:
Reason 1:
If you are working in one SQL*Plus session and a DBA shuts down the database from another session, the next query you run in your session fails as follows:
SQL> select * from dual;
select * from dual
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 10308
Session ID: 9 Serial number: 3

Solution 1:
Reconnect with SQLPLUS and run the command again.

Reason 2:
If you start the database while the recovery file destination (db_recovery_file_dest) is full, you also get this error:
SQL> startup
ORACLE instance started.
Total System Global Area 23584982528 bytes
Fixed Size 2452778 bytes
Variable Size 4531678966 bytes
Database Buffers 2342356778 bytes
Redo Buffers 25876431 bytes
Database mounted.
ORA-03113: end-of-file on communication channel
Process ID: 2588
Session ID: 1705 Serial number: 5

Solution 2:
1. Check the alert log. In this case db_recovery_file_dest is full, and the alert log shows the following warning:
ORA-19815: WARNING: db_recovery_file_dest_size of 2456687415514 bytes is 100.00% used, and has 0 remaining bytes available.
2. Start the database in mount state:
startup mount
3. Check the current value of the parameter and increase it:
show parameter db_recovery_file_dest_size
-- the new value must be larger than the current one (for example 10 GB more); 75G below is only an example
alter system set db_recovery_file_dest_size = 75G scope=both;

4. Open the database:
alter database open;
5. Free the space with RMAN:
RMAN> backup archivelog all delete input;
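The size used in step 3 has to exceed the current limit, otherwise the recovery area is still full. A minimal shell sketch for working out such a value is shown here: it rounds the current byte count up to whole gigabytes, adds some headroom, and prints the statement to run. The function bump_fra_limit and the amount of headroom are assumptions for illustration, not a recommendation.

```shell
#!/bin/sh
# Print an ALTER SYSTEM statement that raises db_recovery_file_dest_size.
#   $1  current limit in bytes (as reported by ORA-19815)
#   $2  headroom to add, in GiB
# The current limit is rounded up to a whole GiB before the headroom is added.
bump_fra_limit() {
    current_bytes=$1
    extra_gib=$2
    gib=1073741824
    new_gib=$(( (current_bytes + gib - 1) / gib + extra_gib ))
    echo "alter system set db_recovery_file_dest_size = ${new_gib}G scope=both;"
}

# Byte count from the warning in step 1, plus 10 GiB of headroom.
bump_fra_limit 2456687415514 10
```

The script only prints the statement and does not connect to the database. Run the printed line in SQL*Plus as SYSDBA after checking that the disk really has that much space.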

Reason 3:
A redo log file is inactive or corrupted.
Solution 3:
1. Startup the instance in nomount:
SQL> startup nomount
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size 2926472 bytes
Variable Size 1224738936 bytes
Database Buffers 905969664 bytes
Redo Buffers 13848576 bytes

2. Mount the database:
alter database mount;
Database altered.

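3. Clear the redo log files that were damaged by a power failure or an unclean shutdown. Note that clearing an unarchived log discards redo that was never archived, so take a fresh backup afterwards.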
SQL> alter database clear unarchived logfile group 1;
Database altered.
SQL> alter database clear unarchived logfile group 2;
Database altered.
SQL> alter database clear unarchived logfile group 3;
Database altered.

4. Shut down the database and start it again:
SQL> shutdown immediate
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.
Total System Global Area 2147483648 bytes
Fixed Size 2926472 bytes
Variable Size 1224738936 bytes
Database Buffers 905969664 bytes
Redo Buffers 13848576 bytes
Database mounted.
Database opened.
