Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads

To: Sam Vaughan <sjv@xxxxxxx>
Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads
From: Stewart Smith <stewart@xxxxxxxxx>
Date: Mon, 13 Nov 2006 15:09:02 +1100
Cc: xfs@xxxxxxxxxxx
In-reply-to: <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@xxxxxxx>
Organization: MySQL AB
References: <1163381602.11914.10.camel@xxxxxxxxxxxxxxxxxxxxx> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@xxxxxxx>
Sender: xfs-bounce@xxxxxxxxxxx
On Mon, 2006-11-13 at 13:58 +1100, Sam Vaughan wrote:
> Are the two processes in your test writing files to the same  
> directory as each other?  If so then their allocations will go into  
> the same AG as the directory by default, hence the fragmentation.  If  
> you can limit yourself to an AG's worth of data per directory then  
> you should be able to avoid fragmentation using the default  
> allocator.  If you need to reserve more than that per AG, then the  
> files will most likely start interleaving again once they spill out  
> of their original AGs.  If that's the case then the upcoming  
> filestreams allocator may be your best bet.

I do predict that the filestreams allocator will be useful for us (and
also on my MythTV box...).

The two processes write to their own directories.

The structure of the "filesystem" for the process (ndbd) is:

ndb_1_fs/ (the 1 refers to the node id, so there is an ndb_2_fs for a
2 node setup)
        D8/, D9/, D10/, D11/
                all have a DBLQH subdirectory. In here there are several
                S0.FragLog files (the number changes). These are 16MB
                files used for logging.
                We (currently) don't do any xfsctl allocation on these.
                We should though. In fact, we're writing them in a way
                to get holes (which probably affects performance).
                These files are write only (except during a full cluster
                restart - a very rare event).

        LCP/0/T0F0.Data
                (there is at least 0, 1, 2 for that first number;
                T0 is table 0, and there can be thousands of tables;
                F0 is fragment 0, and there can be a few of those too,
                though typically 2-4)
                These are an on-disk copy of in-memory tables, variably
                sized files (as big or as small as tables in a DB)
                The above log files are for changes occurring during the
                writing of these files.

        datafile01.dat, undofile01.dat etc
        whatever files the user creates for disk based tables
                These are the datafiles and undofiles that I've done the
                special allocation for.
                Typical deployments will have anything from a few
                hundred MB per file to a few GB to many many GB.
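
As a rough illustration of reserving space up front instead of writing a
file in a way that leaves holes, here is a minimal Python sketch. It is an
assumption-laden stand-in, not what ndbd does: it uses the portable
os.posix_fallocate() call rather than the XFS ioctls discussed in this
thread, and the helper names (preallocate, allocated_bytes) and the
constant FRAGLOG_SIZE are hypothetical. posix_fallocate() also sets the
file size, so its behaviour should not be assumed to match either ioctl.

```python
import os
import tempfile

# 16MB, the FragLog file size mentioned in the layout above.
FRAGLOG_SIZE = 16 * 1024 * 1024


def preallocate(path, size):
    """Create (or open) path and reserve `size` bytes for it up front."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Ask the filesystem to allocate the whole range now, so later
        # writes inside [0, size) do not have to allocate blocks one
        # piece at a time. This also extends the file to `size` bytes.
        os.posix_fallocate(fd, 0, size)
    finally:
        os.close(fd)


def allocated_bytes(path):
    """Disk space backing the file; st_blocks counts 512-byte units."""
    return os.stat(path).st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Hypothetical file name following the S0.FragLog pattern.
        log = os.path.join(d, "S0.FragLog")
        preallocate(log, FRAGLOG_SIZE)
        print("size:", os.path.getsize(log))
        print("allocated:", allocated_bytes(log))
```

Comparing the reported size with the allocated bytes is a quick way to
spot a file that still has holes; how contiguous the reserved space ends
up is still up to the filesystem's allocator.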

"Typical" installations are probably now evenly split between 1 process
per physical machine and several (usually 2).
-- 
Stewart Smith, Software Engineer
MySQL AB, www.mysql.com
Office: +14082136540 Ext: 6616
VoIP: 6616@xxxxxxxxxxxxxxxx
Mobile: +61 4 3 8844 332

Jumpstart your cluster:
http://www.mysql.com/consulting/packaged/cluster.html
