| To: | Thomas Graf <tgraf@xxxxxxx> |
|---|---|
| Subject: | Re: [RFC] textsearch infrastructure et al v2 |
| From: | Pablo Neira <pablo@xxxxxxxxxxx> |
| Date: | Sat, 28 May 2005 14:56:37 +0200 |
| Cc: | jamal <hadi@xxxxxxxxxx>, netdev@xxxxxxxxxxx |
| In-reply-to: | <20050528123542.GR15391@postel.suug.ch> |
| References: | <20050527224725.GG15391@postel.suug.ch> <1117281581.6251.68.camel@localhost.localdomain> <20050528123542.GR15391@postel.suug.ch> |
| Sender: | netdev-bounce@xxxxxxxxxxx |
| User-agent: | Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20050105 Debian/1.7.5-1 |
Thomas Graf wrote:
* jamal <1117281581.6251.68.camel@xxxxxxxxxxxxxxxxxxxxx> 2005-05-28 07:59

to be frank, i'm still willing to propose some changes to Thomas, I'll do so soon since he's pulled my ear with this second rfc request ;).

What I remember is that libssearch (or whatever that thing Harald was looking at) did it differently (callback invoked and it returned a code which said to continue or not).

hm, i don't quite understand, i bet that libqsearch was already fragment-aware. Anyway the main difference is that libqsearch wasn't designed to be used in user space so, for example, it needed a complete rework to reduce dynamic memory allocations.

Also the design should (I think you do already, just double checking) - should be wary of optimizing for a specific algorithm; i see you have KMP but not Boyer-Moore for example and i am not sure what the repercussions of the above approach are etc etc.

For small patterns and long packets, such pattern searching on the fragment borders doesn't really hurt performance. Anyway, having several algorithms lets users choose the one that best suits their needs. boyer-moore (BM) and other variants are definitely a must-have. I'm still reading some papers about string matching and practical applications (p2p traffic recognition based on string matching, ids, etc etc) and the most interesting practical results always come from the use of BM and friends.

I think the best approach would be to first linearize then search.
Netfilter used to follow this approach in early 2.6 kernels and Patrick McHardy demonstrated with some oprofile stuff that skb_copy_bits decreased performance. I'm not familiar with those problems jamal has mentioned though, could they really affect the string matching infrastructure in some way? I truly prefer to avoid linearizing skb's.

The only other comment i have is on the patch you called naive regexp; I think you should probably call it something else instead of regexp since you invented it.
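
To make the "search without linearizing" option above a bit more concrete, here is a rough user-space sketch of a KMP matcher that keeps its state between fragments, so a pattern straddling a fragment border is still found without copying everything into one buffer. This is only an illustration: `struct kmp_state`, `kmp_init`, `kmp_feed` and `MAX_PATTERN` are made-up names, not the interface from the RFC patches, and plain byte buffers stand in for skb fragments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PATTERN 64  /* arbitrary limit chosen for this sketch */

/* Hypothetical matcher state; not the API from the RFC patches. */
struct kmp_state {
    const char *pattern;
    size_t len;
    size_t fail[MAX_PATTERN]; /* KMP failure (prefix) table */
    size_t matched;           /* pattern bytes matched so far, kept across fragments */
    size_t consumed;          /* bytes seen in fully processed earlier fragments */
};

/* Build the failure table. Returns 0 on success, -1 on a bad pattern. */
static int kmp_init(struct kmp_state *s, const char *pattern)
{
    size_t i, k = 0;
    size_t len = strlen(pattern);

    if (len == 0 || len > MAX_PATTERN)
        return -1;

    s->pattern = pattern;
    s->len = len;
    s->matched = 0;
    s->consumed = 0;
    s->fail[0] = 0;
    for (i = 1; i < len; i++) {
        while (k > 0 && pattern[i] != pattern[k])
            k = s->fail[k - 1];
        if (pattern[i] == pattern[k])
            k++;
        s->fail[i] = k;
    }
    return 0;
}

/*
 * Feed one fragment. Returns the absolute offset (counted over all
 * fragments fed so far) where the first match starts, or -1 if this
 * fragment did not complete a match. After a match the state must be
 * re-initialised with kmp_init() before it is used again.
 */
static long kmp_feed(struct kmp_state *s, const char *frag, size_t frag_len)
{
    size_t i;

    assert(s->matched < s->len); /* state must not be reused after a match */

    for (i = 0; i < frag_len; i++) {
        /* On mismatch, fall back to the longest border of the matched prefix. */
        while (s->matched > 0 && frag[i] != s->pattern[s->matched])
            s->matched = s->fail[s->matched - 1];
        if (frag[i] == s->pattern[s->matched])
            s->matched++;
        if (s->matched == s->len)
            return (long)(s->consumed + i + 1 - s->len);
    }
    s->consumed += frag_len;
    return -1;
}
```

Calling `kmp_feed()` once per fragment, in order, is enough; the only thing carried over a fragment border is the `matched` counter, so nothing has to be copied. A Boyer-Moore style matcher is harder to fit into this shape because it wants to jump backwards and forwards in the text, which is part of why the choice of algorithm and the linearize question are tied together.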
-- Pablo