
Need Advice replacing MySQL with XFS

To: linux-xfs@xxxxxxxxxxx
Subject: Need Advice replacing MySQL with XFS
From: Douglass Judd <doug@xxxxxxxxxxxxxx>
Date: Thu, 23 Jun 2005 11:19:55 -0700
Sender: linux-xfs-bounce@xxxxxxxxxxx
I've written a web crawler (distributed across 16 machines) using MySQL
(INNODB) as the underlying document store.  I'm using MySQL for its
b-tree index to quickly access each document using the MD5 hash of the
URL as the key.  I'm considering replacing the MySQL table that I use as
the document repository with a simple directory hierarchy in XFS.  Here
are my basic requirements:

1. Each machine will start out with about 20 million documents but needs
to be able to grow to 100 million documents.  Each document is about
10KB in size.

2. Given a key (a 16-byte hash value), need to be able to efficiently
insert documents and locate/update existing documents.

3. Need to be able to sequentially scan the entire repository, exporting
all of the documents for the next stage of processing.
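
A minimal sketch of what such a directory hierarchy might look like,
covering the three requirements above.  Everything here is an
illustrative assumption rather than a tested design: the two-level
layout keyed on the first two bytes of the hash, the function names,
and the write-to-temp-then-rename update are all hypothetical choices.

```python
import hashlib
import os


def key_for_url(url: str) -> bytes:
    """Return the 16-byte MD5 digest of the URL, used as the document key."""
    return hashlib.md5(url.encode("utf-8")).digest()


def path_for_key(root: str, key: bytes) -> str:
    """Map a key to root/aa/bb/<32 hex chars> (assumed two-level fan-out)."""
    hexkey = key.hex()
    return os.path.join(root, hexkey[:2], hexkey[2:4], hexkey)


def put_document(root: str, key: bytes, data: bytes) -> None:
    """Insert or update a document; rename makes the replacement atomic."""
    path = path_for_key(root, key)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, path)


def get_document(root: str, key: bytes):
    """Return the document bytes, or None if the key is not stored."""
    try:
        with open(path_for_key(root, key), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return None


def scan_documents(root: str):
    """Yield (key, data) for every stored document, in key order."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a deterministic order
        for name in sorted(filenames):
            if name.endswith(".tmp"):
                continue  # skip partially written files
            with open(os.path.join(dirpath, name), "rb") as f:
                yield bytes.fromhex(name), f.read()
```

Because MD5 output is close to uniformly distributed, documents should
spread roughly evenly across the leaf directories in a layout like this.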

First of all, is this even doable?  Is there an optimal number of
entries per directory for best performance?
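
To put rough numbers on the entries-per-directory question, the sketch
below works out the average number of files per leaf directory at the
100 million document target.  It assumes a fan-out of 256 subdirectories
per level (one hash byte per level) and an even spread of keys; both are
assumptions for illustration, and it says nothing about which figure XFS
handles best.

```python
def files_per_directory(total_docs: int, levels: int, fanout: int = 256) -> float:
    """Average files per leaf directory for a hierarchy `levels` deep."""
    return total_docs / (fanout ** levels)


if __name__ == "__main__":
    total = 100_000_000  # growth target per machine, from requirement 1
    for levels in (1, 2, 3):
        avg = files_per_directory(total, levels)
        print(f"{levels} level(s): {256 ** levels} leaf dirs, "
              f"about {avg:.1f} files each")
```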

Any advice would be greatly appreciated.

- Doug


