From owner-numa@oss.sgi.com Fri Apr 6 02:32:58 2001
Message-ID: <3ACD8BF1.38E5B9E1@rmchost.bristol.st.com>
Date: Fri, 06 Apr 2001 11:27:13 +0200
From: Fabrizio Sensini
Organization: STMicroelectronics
To: numa@oss.sgi.com
Subject: Discontiguous memory support for SuperH

Hi there,

I am currently doing some work to enable discontiguous memory support on
a board hosting a Hitachi SuperH processor (SH4 7750).
The rationale for this work is twofold:

- the board I am using has two memory blocks (32 MB cached SDRAM and
  16 MB uncached PCI DRAM), of which only the first has so far been made
  available to Linux. I need the DRAM to allow DMA transfers from/to PCI
  devices via pci_alloc_consistent(), and 16 more MB of memory should not
  (in principle) do any harm;
- it is a good learning exercise: please consider that I am currently in
  a training phase, so bear with me if I talk nonsense every so often.

My idea was to use discontiguous memory support for this purpose,
declaring the two memory blocks as distinct nodes, one entirely DMA-able
and the other normal. This way any allocation with the GFP_DMA flag set
would go to the PCI DRAM, including the one done by the most common
implementation of pci_alloc_consistent().

I am currently using kernel version 2.4.1-st0.1 (which is essentially
2.4.1 with some arch-dependent changes by STMicroelectronics, my company,
I believe).

So far I have changed the architecture-dependent source code as follows:

- introduced discontiguous memory support in the configuration info;
- declared 2 nodes, each linked with a distinct bootmem allocator whose
  map is located in the node itself;
- used a single mem_map for both, not starting at __PAGE_OFFSET and
  containing the page descriptors for both blocks, one after the other;
- changed a few conversion macros to make Linux aware of the existence of
  the second memory block: __pa(), __va(), virt_to_page(), pte_page(),
  mk_pte().

In addition I have applied palloc.patch (downloaded from the NUMA
project), because the memory chunks differ in size and, as soon as one of
them is full, alloc_pages() gets stuck. My current setup is demanding in
terms of memory because I am using a ramdisk as the boot filesystem.

The patch appears to solve my problem: at least Linux completes the boot
phase.
The problem I am facing now is that, as soon as init starts, execution is
slowed down by an endless stream of page faults, by a factor of at least
100.

Could anyone give me a one-line hint on what to focus on?

Right now I have the impression that (since palloc.patch is quite recent
and one of the two memory blocks gets filled) there could still be
problems related to alloc_pages(), even though the odds are I am
overlooking something. Another idea is that something missing in release
2.4.1 but present in 2.4.3 could solve my problem.

Thanks in advance for any suggestions you can give.

P.S. My problems magically disappear if, in alloc_pages(), the search
always starts from the first node (no wonder). I am doing this since the
first memory block is the fastest. I suppose that in future this will
probably be included in the kernel (in ways still to be defined).

--
***M*C*D*D***S*T*A*R***D*e*s*i*g*n****C*e*n*t*r*e***i*n***C*a*t*a*n*i*a***
Fabrizio Sensini - MCDD APU team member - STMicroelectronics
Strada Ottava - Zona Industriale - 95121 Catania - Italia
tel: +39 095 740 4556 fax: +39 095 740 4008 Mailto:Fabrizio.Sensini@st.com

From owner-numa@oss.sgi.com Fri Apr 6 11:19:41 2001
From: Kanoj Sarcar
Message-Id: <200104061819.LAA61510@google.engr.sgi.com>
Subject: Re: Failed mail (fwd)
To: numa@oss.sgi.com
Date: Fri, 6 Apr 2001 11:19:23 -0700 (PDT)
In-Reply-To: <200104060235.aa10628@sco.sco.COM> from "MMDF Mail System"
 at Apr 06, 2001 02:35:14 AM

> Hi there,
>
> I am currently doing some work to enable discontiguous memory support on
> a board hosting a Hitachi SuperH processor (SH4 7750).
> The rationale for such an activity is twofold:
> - the board I am using has two memory blocks (32Mb cached SDRAM and
> 16Mb uncached PCI DRAM) of which (currently) only the first one has been

Just to get things clear: are these two pieces of memory discontiguous in
the physical address space? Can both be DMA'ed into/out of? Can both be
written/read by software? Basically, I am not sure what you mean by 16Mb
uncached PCI ...

> made available to Linux: I need to use the DRAM to allow DMA transfer
> from/to PCI devices via pci_alloc_consistent() and 16 more MB of memory
> should not (in principle) do any harm;
> - it is a good learning exercise: please consider that I am currently in
> a training phase, so bear with me if I talk nonsense every so often.
>
> My idea was to use discontiguous memory support for such purpose,
> by declaring the two memory blocks as distinct nodes, one of which
> totally DMAble and the other normal. This way any allocation with the
> GFP_DMA flag set would go to the PCI DRAM, including the one done by the
> most common implementation of pci_alloc_consistent().
>
> I am currently using kernel version 2.4.1-st0.1 (which is essentially
> 2.4.1 with some arch-dependent changes by STMicroelectronics - my
> company, I believe).
>
> So far I have changed the architecture-dependent source code as follows:
>
> - introduced discontiguous memory support in configuration info;
> - declared 2 nodes, each one linked with a distinct bootmem allocator
> whose map is located in the node itself;
> - I am using a single mem_map for both, not starting at __PAGE_OFFSET
> and containing the page descriptors for both blocks, one after the other;

If you are using a single mem_map, you do not need to put bootmems into
the two nodes, I would think.
> - I have changed a few conversion macros to make linux aware of the
> existence of the second memory block: __pa(), __va(), virt_to_page(),
> pte_page(), mk_pte().

If you must have 2 nodes, each should have its own bootmem/memmap, in
which case you should be able to use DISCONTIGMEM support as it exists on
mips64/arm, i.e. you can use definitions for these macros similar to what
those architectures do.

If you have a single memmap, I think you should also have a single
bootmem, and handle both pieces of memory that way. Just make sure to
mark all the memmap entries that represent nonexistent memory between the
32Mb and 16Mb extents as reserved and not available to software (in
paging_init).

Kanoj