Hi there,
I am currently doing some work to enable discontiguous memory support on
a board hosting a Hitachi SuperH processor (SH4 7750).
The rationale for such an activity is twofold:
- the board I am using has two memory blocks (32 MB of cached SDRAM and
16 MB of uncached PCI DRAM), of which only the first is currently
available to Linux. I need the PCI DRAM for DMA transfers from/to PCI
devices via pci_alloc_consistent(), and 16 more MB of memory should not
(in principle) do any harm;
- it is a good learning exercise: please consider that I am currently in
a training phase, so bear with me if I talk nonsense every so often.
My idea was to use discontiguous memory support for this purpose,
declaring the two memory blocks as distinct nodes, one entirely
DMA-capable and the other normal. This way any allocation with the
GFP_DMA flag set would go to the PCI DRAM, including the one done by the
most common implementation of pci_alloc_consistent().
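To illustrate the intent, here is a minimal user-space sketch in C (not kernel code). Every name in it (sk_node, sk_alloc_pages, SK_GFP_DMA) is hypothetical, the 4 KB page size is an assumption, and the sketch only models the routing rule "a GFP_DMA request may be served only by the all-DMA node".

```c
#include <assert.h>
#include <stddef.h>

#define SK_GFP_DMA   0x01u     /* stand-in for GFP_DMA; the value is arbitrary */
#define SK_PAGE_SIZE 4096UL    /* assumed page size */

struct sk_node {
    const char   *name;
    int           dma_capable; /* 1 if every page of the node may be used for DMA */
    unsigned long free_pages;
};

/* Node 0: cached SDRAM (normal). Node 1: uncached PCI DRAM (all DMA-capable). */
static struct sk_node sk_nodes[] = {
    { "SDRAM",    0, (32UL << 20) / SK_PAGE_SIZE },
    { "PCI DRAM", 1, (16UL << 20) / SK_PAGE_SIZE },
};

#define SK_NR_NODES (sizeof(sk_nodes) / sizeof(sk_nodes[0]))

/*
 * Reserve npages from the first node that can satisfy the request.
 * Returns the node index, or -1 if no eligible node has enough free pages.
 */
static int sk_alloc_pages(unsigned int gfp_mask, unsigned long npages)
{
    size_t i;

    assert(npages > 0);
    for (i = 0; i < SK_NR_NODES; i++) {
        struct sk_node *n = &sk_nodes[i];

        /* A DMA request may only be served by a DMA-capable node. */
        if ((gfp_mask & SK_GFP_DMA) && !n->dma_capable)
            continue;
        if (n->free_pages >= npages) {
            n->free_pages -= npages;
            return (int)i;
        }
    }
    return -1;
}
```

In this model sk_alloc_pages(SK_GFP_DMA, 4) is served by node 1 (the PCI DRAM), while a request without the flag is served by node 0 as long as the SDRAM has free pages.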
I am currently using kernel version 2.4.1-st0.1 (which is essentially
2.4.1 with some arch-dependent changes by STMicroelectronics - my
company, I believe).
So far I have changed the architecture-dependent source code as follows:
- introduced discontiguous memory support in the configuration info;
- declared 2 nodes, each one linked with a distinct bootmem allocator
whose map is located in the node itself;
- I am using a single mem_map for both nodes, not starting at
__PAGE_OFFSET and containing the page descriptors for both blocks, one
after the other;
- I have changed a few conversion macros to make Linux aware of the
existence of the second memory block: __pa(), __va(), virt_to_page(),
pte_page(), mk_pte().
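The kind of change made to the conversion macros can be sketched in user-space C as follows. This is only an illustration under stated assumptions: all base addresses below are hypothetical placeholders rather than the real board layout, the page size is assumed to be 4 KB, the sk_* names are invented for the example, and the single mem_map is modelled as an index space in which block 0 comes first and block 1 follows immediately.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: none of these addresses come from the real board. */
#define SK_PAGE_SHIFT 12
#define SK_B0_PHYS UINT32_C(0x0c000000)   /* block 0: cached SDRAM */
#define SK_B0_VIRT UINT32_C(0x8c000000)
#define SK_B0_SIZE (UINT32_C(32) << 20)
#define SK_B1_PHYS UINT32_C(0x10000000)   /* block 1: uncached PCI DRAM */
#define SK_B1_VIRT UINT32_C(0xb0000000)
#define SK_B1_SIZE (UINT32_C(16) << 20)

/* Unsigned subtraction wraps, so this is also false when addr < base. */
static int sk_in_range(uint32_t addr, uint32_t base, uint32_t size)
{
    return (uint32_t)(addr - base) < size;
}

/* Analogue of __pa(): pick the block first, then apply that block's offset. */
static uint32_t sk_pa(uint32_t vaddr)
{
    if (sk_in_range(vaddr, SK_B1_VIRT, SK_B1_SIZE))
        return vaddr - SK_B1_VIRT + SK_B1_PHYS;
    assert(sk_in_range(vaddr, SK_B0_VIRT, SK_B0_SIZE));
    return vaddr - SK_B0_VIRT + SK_B0_PHYS;
}

/* Analogue of __va(): the inverse mapping. */
static uint32_t sk_va(uint32_t paddr)
{
    if (sk_in_range(paddr, SK_B1_PHYS, SK_B1_SIZE))
        return paddr - SK_B1_PHYS + SK_B1_VIRT;
    assert(sk_in_range(paddr, SK_B0_PHYS, SK_B0_SIZE));
    return paddr - SK_B0_PHYS + SK_B0_VIRT;
}

/*
 * Index of a physical page in the single, shared mem_map:
 * block 0 descriptors first, block 1 descriptors right after them.
 */
static uint32_t sk_map_index(uint32_t paddr)
{
    if (sk_in_range(paddr, SK_B1_PHYS, SK_B1_SIZE))
        return (SK_B0_SIZE >> SK_PAGE_SHIFT)
             + ((paddr - SK_B1_PHYS) >> SK_PAGE_SHIFT);
    assert(sk_in_range(paddr, SK_B0_PHYS, SK_B0_SIZE));
    return (paddr - SK_B0_PHYS) >> SK_PAGE_SHIFT;
}
```

The point of the sketch is that a single constant offset is no longer enough: each conversion must first decide which block the address belongs to, and virt_to_page()-style lookups must skip the hole between the two blocks when indexing mem_map.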
In addition I have applied palloc.patch (downloaded from the NUMA
project). This is because, the memory chunks being different in size,
alloc_pages() gets stuck as soon as one of them is full. My current
settings are demanding in terms of memory because I am using a ramdisk
as bootfs.
The patch appears to solve my problem: at least Linux completes the boot
phase.
The problem I am facing now is that, as soon as init starts, execution
is slowed down by a factor of at least 100 by an endless stream of page
faults.
Could anyone give me a one-line hint on what to focus on?
Right now I have the impression that (since palloc.patch is quite recent
and one of the two memory blocks gets filled) there could still be
problems related to alloc_pages(), even though the odds are that I am
overlooking something.
Another possibility is that something is missing in release 2.4.1 that
is present in 2.4.3, and that it would solve my problem.
Thanks in advance for any suggestions you can give.
P.S. My problems magically disappear if, in alloc_pages(), the search
always starts from the first node (no wonder).
I am doing this since the first memory block is the fastest. I suppose
that in the future this will probably be included in the kernel (in ways
still to be defined).
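The search order mentioned in the P.S. can be modelled with a small user-space C sketch. It is a toy, not the kernel's alloc_pages(): the sk_* names are hypothetical, and the rotating start node is only an assumption used here for contrast with the "always start from the first node" policy.

```c
#include <assert.h>

#define SK_NR_NODES 2

enum sk_start { SK_START_FIRST, SK_START_ROTATE };

struct sk_alloc_state {
    unsigned long free_pages[SK_NR_NODES];
    int next;   /* next start node; used only by SK_START_ROTATE */
};

/*
 * Allocate one page. The search begins at the start node chosen by the
 * policy and falls back to the remaining nodes in order.
 * Returns the node that served the request, or -1 if all nodes are full.
 */
static int sk_alloc_page(struct sk_alloc_state *s, enum sk_start policy)
{
    int start = 0;
    int i;

    if (policy == SK_START_ROTATE) {
        start = s->next;
        s->next = (s->next + 1) % SK_NR_NODES;
    }
    for (i = 0; i < SK_NR_NODES; i++) {
        int node = (start + i) % SK_NR_NODES;

        if (s->free_pages[node] > 0) {
            s->free_pages[node]--;
            return node;
        }
    }
    return -1;
}
```

With SK_START_FIRST every request is served by node 0 (the fast SDRAM) until it is exhausted, and only then by node 1. With SK_START_ROTATE consecutive requests alternate between the two nodes, so the slower block is used from the very beginning.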
--
***M*C*D*D***S*T*A*R***D*e*s*i*g*n****C*e*n*t*r*e***i*n***C*a*t*a*n*i*a***
Fabrizio Sensini - MCDD APU team member - STMicroelectronics
Strada Ottava - Zona Industriale - 95121 Catania - Italia
tel: +39 095 740 4556 fax: +39 095 740 4008 Mailto:Fabrizio.Sensini@xxxxxx