Again: Linux 2.3.31/36 bug using large memory

From: Nagy Tibor (nagyt@otpbank.hu)
Date: Thu Jan 06 2000 - 03:59:15 EST


Hi,

I sent the attached bug report earlier, when the problem appeared with
Linux kernel 2.3.31. The problem still exists with 2.3.36.

I note that I tried setting the memory to 2047M and then to 1024M at boot
time, but the situation was still the same. After consuming about 700-800M
the system crashed. The amount of used memory did not reach even 1 GByte!!!

I must note that the 2.2.1[123] kernels have been working well with 2 GB
of memory for a long time.

Tibor

------------------------------------------------------------------------
Tibor Nagy
Card and Core Banking Systems Development Project
National Savings and Commercial Bank Ltd (OTP Bank)
H-1051 Budapest Nador u. 16.
Tel: 00 36 1 374 6990 Fax: 00 36 1 374 6981 E-mail: nagyt@otpbank.hu
------------------------------------------------------------------------

attached mail follows:


See attached problem report.

------------------------------------------------------------------------
Tibor Nagy
Director of Card and Core Banking Systems Development Project
National Savings and Commercial Bank Ltd (OTP Bank)
H-1051 Budapest Nador u. 16.
Tel: 00 36 1 374 6990 Fax: 00 36 1 374 6981 E-mail: nagyt@otpbank.hu
------------------------------------------------------------------------

linux-kernel@vger.rutgers.edu

[1.] One line summary of the problem:

The Linux kernel crashes when using a lot of memory.

[2.] Full description of the problem/report:

On a Dell PowerEdge 6300 (4 CPU Xeon 550 MHz, 4 GB RAM) the kernel is not
able to allocate more memory after passing a certain amount of used
memory.
As a result, no more processes can be started:
         fork: Cannot allocate memory.
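
The symptom described above can be probed from a process that is already running. The following is only a sketch, not part of the original report: it assumes that the shell message corresponds to fork() failing with ENOMEM, and the function name `can_fork` is hypothetical.

```python
import errno
import os


def can_fork():
    """Return True if a child process can be created, False on ENOMEM.

    Hypothetical helper: it assumes the failure seen in the shell
    ("fork: Cannot allocate memory") is fork() returning ENOMEM.
    """
    try:
        pid = os.fork()
    except OSError as exc:
        if exc.errno == errno.ENOMEM:
            return False
        raise  # any other error is unexpected, so do not hide it
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup
    os.waitpid(pid, 0)  # parent: reap the child to avoid a zombie
    return True


if __name__ == "__main__":
    print("fork works" if can_fork() else "fork failed: ENOMEM")
```

A probe like this has to be started before the failure occurs, because starting it afterwards would itself require a fork.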

[3.] Keywords (i.e., modules, networking, kernel):

kernel, 4GB, memory management

[4.] Kernel version (from /proc/version):

Linux version 2.3.31 (root@dell632) (gcc version egcs-2.91.66
19990314/Linux (egcs-1.1.2 release)) #6 SMP Thu Dec 9 09:00:39 CET
1999

[5.] Output of Oops.. message (if applicable) with symbolic information
     resolved (see Documentation/oops-tracing.txt)

-

[6.] A small shell script or example program which triggers the
     problem (if possible)

The problem is very simple to reproduce:
         dd if=<a disk> of=/dev/null bs=10240k
The buffer memory reaches 2 GB very quickly.
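
For readers who prefer a program to dd, the same access pattern can be sketched in Python: read the source sequentially in 10240 KB blocks and discard the data. This is a minimal sketch and not part of the original report; the function name `read_and_discard` and its `limit` parameter are hypothetical, and the path is whatever disk device (or large file) is being tested.

```python
import sys

BLOCK_SIZE = 10240 * 1024  # same block size as bs=10240k


def read_and_discard(path, block_size=BLOCK_SIZE, limit=None):
    """Read `path` sequentially, throw the data away, return bytes read.

    `limit` (hypothetical parameter) stops the loop once at least that
    many bytes have been read, so the sketch can be tried on a small
    file first.
    """
    total = 0
    # buffering=0 turns off Python's own buffering; the kernel still
    # caches the data it reads, which is the effect being exercised.
    with open(path, "rb", buffering=0) as src:
        while limit is None or total < limit:
            chunk = src.read(block_size)
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    return total


if __name__ == "__main__":
    if len(sys.argv) > 1:
        print(read_and_discard(sys.argv[1]), "bytes read")
```

Running it with a disk device as the only argument (as root) should behave much like the dd command, with the data going nowhere.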

[7.] Environment

On a Dell PowerEdge 6300 (4 CPU Xeon 550 MHz, 4 GB RAM) the kernel is not
able to allocate more memory once used memory passes 2 GB.

[7.1.] Software (add the output of the ver_linux script here)
[7.2.] Processor information (from /proc/cpuinfo):

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 550.037362
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx osfxsr kni
bogomips : 548.86

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 550.037362
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx osfxsr kni
bogomips : 548.86

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 550.037362
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx osfxsr kni
bogomips : 548.86

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 550.037362
cache size : 1024 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx osfxsr kni
bogomips : 548.86

[7.3.] Module information (from /proc/modules):

eepro100 14744 2

[7.4.] SCSI information (from /proc/scsi/scsi)

Attached devices:
Host: scsi0 Channel: 00 Id: 04 Lun: 00
  Vendor: PLEXTOR Model: CD-R PX-R412C Rev: 1.03
  Type: CD-ROM ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: QUANTUM Model: DLT7000 Rev: 2150
  Type: Sequential-Access ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 05 Lun: 00
  Vendor: NEC Model: CD-ROM DRIVE:466 Rev: 1.06
  Type: CD-ROM ANSI SCSI revision: 02
Host: scsi2 Channel: 00 Id: 06 Lun: 00
  Vendor: DELL Model: 1x6 U2W SCSI BP Rev: 5.23
  Type: Processor ANSI SCSI revision: 02
Host: scsi2 Channel: 01 Id: 00 Lun: 00
  Vendor: MegaRAID Model: LD0 RAID0 17278R Rev: 3.00
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 01 Id: 00 Lun: 01
  Vendor: MegaRAID Model: LD1 RAID0 17278R Rev: 3.00
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 01 Id: 00 Lun: 02
  Vendor: MegaRAID Model: LD2 RAID0 17278R Rev: 3.00
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 01 Id: 00 Lun: 03
  Vendor: MegaRAID Model: LD3 RAID0 17278R Rev: 3.00
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 01 Id: 00 Lun: 04
  Vendor: MegaRAID Model: LD4 RAID0 17278R Rev: 3.00
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi2 Channel: 01 Id: 00 Lun: 05
  Vendor: MegaRAID Model: LD5 RAID0 17278R Rev: 3.00
  Type: Direct-Access ANSI SCSI revision: 02

[7.5.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):

The output of vmstat at the time of the crash:

 0 1 0 0 3273564 634264 30204 0 0 19062 0 409 610 0 4 96
 0 1 0 0 3252320 653080 30204 0 0 18816 0 405 612 0 4 96
 0 1 0 0 3230800 672140 30204 0 0 19062 0 403 610 0 9 91
 0 1 0 0 3209480 691024 30204 0 0 18882 0 402 611 0 7 93
 0 1 0 0 3188096 709964 30204 0 0 19004 0 405 612 0 7 93
 0 1 0 0 3166644 728964 30204 0 0 18938 0 405 611 0 8 92
/proc: Cannot allocate memory

After that the console still works, but no more commands can be
started because of the fork error (Cannot allocate memory). The
built-in shell commands (kill, echo, etc.) still work.
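
The vmstat samples in [7.5] can be summarised with a short script. This is a sketch only: it assumes the 16 columns follow the classic procps layout (r b w swpd free buff cache si so bi bo in cs us sy id, with memory in KB), which the report does not state, and the names `COLUMNS`, `parse_vmstat` and `summarize` are made up for illustration. Under that assumption, free memory falls by about 21,000 KB per sample while buffers grow by about 19,000 KB, and the last sample before the failure shows roughly 712 MB of buffers with about 3 GB still free.

```python
# Assumed column layout (classic procps vmstat); not stated in the report.
COLUMNS = ["r", "b", "w", "swpd", "free", "buff", "cache",
           "si", "so", "bi", "bo", "in", "cs", "us", "sy", "id"]

# Samples copied from the report, including the final error line.
SAMPLES = """\
 0 1 0 0 3273564 634264 30204 0 0 19062 0 409 610 0 4 96
 0 1 0 0 3252320 653080 30204 0 0 18816 0 405 612 0 4 96
 0 1 0 0 3230800 672140 30204 0 0 19062 0 403 610 0 9 91
 0 1 0 0 3209480 691024 30204 0 0 18882 0 402 611 0 7 93
 0 1 0 0 3188096 709964 30204 0 0 19004 0 405 612 0 7 93
 0 1 0 0 3166644 728964 30204 0 0 18938 0 405 611 0 8 92
/proc: Cannot allocate memory
"""


def parse_vmstat(text):
    """Turn numeric vmstat rows into dicts; skip anything else."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue  # not a data row (e.g. the error message)
        if not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def summarize(rows):
    """Average change per sample in free and buffer memory (KB)."""
    first, last = rows[0], rows[-1]
    steps = len(rows) - 1
    return {
        "free_delta_per_sample_kb": (last["free"] - first["free"]) / steps,
        "buff_delta_per_sample_kb": (last["buff"] - first["buff"]) / steps,
        "last_free_kb": last["free"],
        "last_buff_kb": last["buff"],
    }


if __name__ == "__main__":
    for key, value in summarize(parse_vmstat(SAMPLES)).items():
        print(key, value)
```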

[X.] Other notes, patches, fixes, workarounds:

-


