netdev
[Top] [All Lists]

Re: Asynchronous crypto layer.

To: James Morris <jmorris@xxxxxxxxxx>
Subject: Re: Asynchronous crypto layer.
From: Evgeniy Polyakov <johnpol@xxxxxxxxxxx>
Date: Mon, 01 Nov 2004 08:58:59 +0300
Cc: netdev@xxxxxxxxxxx, cryptoapi@xxxxxxxxxxxxxx, "David S. Miller" <davem@xxxxxxxxxxxxx>
In-reply-to: <Xine.LNX.4.44.0410311004290.677-100000@thoron.boston.redhat.com>
Organization: MIPT
References: <Xine.LNX.4.44.0410311004290.677-100000@thoron.boston.redhat.com>
Reply-to: johnpol@xxxxxxxxxxx
Sender: netdev-bounce@xxxxxxxxxxx
On Sun, 2004-10-31 at 19:05, James Morris wrote:
> > Please review and comment, I am open for discussion.
> 
> Here is an initial review, based on the updated patch posted my Michael 
> Ludwig.
> 
> It's great to see this code being written rather than just discussed.
> 
> Have you seen earlier discussions on hardware crypto requirements, some of
> which are summarized at http://samba.org/~jamesm/crypto/hardware_notes.txt
> , and further discussed on the cryptoapi list?

Yes, I have read it and many others.
I believe I've gathered all usefull features and implement them in a
right direction.

> Briefly, the main components required for hardware crypto support are:
> 
> a) Crypto driver API
> 
> This is for registering all types of crypto drivers, including the
> software algorithms and various hardware devices.  A feature of this API
> would be to communicate various capabilities and features of the driver to
> the core crypto code.  Your code appears to at least partially provide
> such an API, although I'm not clear on exactly what it's suitable for all
> types of crypto drivers.  It looks to be heading in the right direction.
> 
> The existing software drivers should be converted to the new API and the 
> old API removed.

Or we can create some kind of bridge between old sync codebase and new
async API. Peaple are using sync API, async can live with it without
problems.

> b) Kernel crypto API
> 
> This refers to the API used by general kernel code which needs to invoke
> crypto processing.  Currently we have a simple synchronous API.
> 
> For hardware support we need an asynchronous API, which for example, would 
> allow a caller to:
> 
>  - allocate a crypto session, specifying preferences, algorithms to be 
>    used etc.
>  - manage crypto session parameters (e.g. key info etc).
>  - batch and submit crypto operations
>  - receive completed operation results

I've put first tree into one function crypto_session_alloc(), although
there are mechanisms to manage crypto session after it is allocated.

> I would imagine that we would deprecate the existing sync API, convert 
> existing users to the async API then eventually remove the sync API.
> 
> This areas needs more analysis and development.
> 
> 
> c) Async engine: scheduling/batching/routing etc.
> 
> This seems to be the core of what you've developed so far.  I'm not sure 
> if we need pluggable load balancers.  How would the sysadmin select the 
> correct one?  The simpler this code is the better.

By following command:
echo -en "simple_lb" > /sys/class/crypto_lb/lbs
or by sending [still not implemented] connector's(netlink) command.

> 
> d) User API via filesystem.
> 
> The user API issue has been discussed on the crypto API list previously.  
> I have outlined some ideas for a pseudo filesytem API, although there are 
> still some issues to be resolved.
> 
> 
> Overall I think it's a good start.  There are some chicken & egg type 
> problems when you don't have GPL drivers, hardware or an existing async 
> API, so I'd imagine that this will all continue to evolve:  with more 
> hardware we can write/incorporate more drivers, with more drivers we can 
> further refine the async support and API etc.

That is true, but I think not all parts of API should be exported as GPL
only.

> Code Review.
> 
> Firstly, please follow Documentation/CodingStyle more closely.  Perhaps 
> run Lindent over your code and have a look at the differences.

I waited this :)

> It's much easier to review and test the code if it is supplied as a patch
> against a recent Linus kernel, rather than a collection of files.  If you 
> update your tree, please regenerate the patch rather than just send new 
> files.

Sure. Files were sent for review, since they can be compiled as stand
alone module.

> 
> Here are some issues that I noticed:
> 
> 
> What is main_crypto_device?  Is this a placeholder for when there are no 
> other devices registered?

main_crypto_device is just virtual device into which all sessions are
placed along with specific crypto device queue.
it is usefull for cases when some driver decides that HW is broken and
marks itself like not working, then scheduler _may_ (simple_lb does not
have such feature) and actually _should_ move all sessions that are
placed into broken device's queue into other devices or drop them(call
callback with special session flag).
It is also usefull for the following situation:
consider situation when several slow devices has it's queues with some
sessions in it, then we add new fast crypto device, bu there are no new
sessions that can be pleced into it. Scheduler may decide to move
sessions from different slow device into new fast one.
All above cases can be easier to implement if we already have one queue
with all sessions in it, and scheduler should not go through all crypto
device's queues.

> Async & sync drivers need to both be treated as crypto drivers.
> 
> 
> static inline void crypto_route_free(struct crypto_route *rt)
> {
>         crypto_device_put(rt->dev);
>         rt->dev = NULL;
>         kfree(rt);
> }
> 
> Why do you set rt->dev to NULL here?  It should not still be referenceable 
> at this point.

crypto_route_free() can be called only from crypto_route_del() which
unlinks given route thus noone can reference rt->dev(but of course
asnyone can have it's private copy, obtained under the lock).
Any route access is guarded by route list lock.

> 
> __crypto_route_del() etc:
> 
> Why are you rolling your own list management and not using the list.h 
> functions?

It is more convenient here to use queue like sk_buf does.

> 
> +struct crypto_device_stat
> +{
> +       __u64                   scompleted;
> +       __u64                   sfinished;
> ...
> 
> Please see how networking stats are implemented (e.g. netif_rx_stats) and 
> previous netdev discussions on 64-bit counters.  Please only use __u64 and 
> similar when exporting structures to userspace.

I use __ prefix for any parameter that is exported to userspace.
I've seen several attempts to convert network statistic to 64bit values,
with backward compatibility it is not an easy task. Let's do not
catch this again.

> Struct ordring:
> 
> +       u16                     priority;
> +
> +       u64                     id;
> +       u64                     dev_id;
> +
> +       u32                     flags;
> 
> The struct will be better aligned if you put the smaller fields first
> (although there may be cases where you want cache-hot items at the front).  
> Also, why the gaps between fields?
> 

All they are created with cache alignment in mind - above cut is
following:
struct crypto_session_initializer
{
        u16                     operation;
        u16                     type;
        u16                     mode;
        u16                     priority;

        u64                     id;
        u64                     dev_id;

        u32                     flags;

        u32                     bdev;

        crypto_callback_t       callback;
};

> 
> __crypto_device_add():
> 
> Please split device initialization and linking into distinct functions.  
> The initialization should probably be done in e.g. crypto_device_alloc() 
> instead of this.

No problem. I gave it such name since __crypto_device_add() is called
only from crypto_device_add() (and main_crypto_device initialisation
routing).

> +       struct crypto_device *__dev, *n;
> 
> Please use a name other than __dev here.  Underscored variable names are 
> generally used for things like local variables in macros, not local 
> temporary variables in functions.

Ok.

> void crypto_device_remove(struct crypto_device *dev):
> 
> Don't loop waiting for the device to become free.  Also, you're scheduling 
> inside a spinlock.  This code needs to be re-thought in general to ensure 
> that a device is always destroyed safely.

No, lock is already freed there.
The sequence is following:
grab the lock.
search given device
unlink it
free the lock // after it nobody can add new sessions to the given
device
wait until device becomes free - it should be called from device's
module_exit() function, thus we are in a process context.
while waiting(device finishes it's sessions processing) we
reschedule itself.
return.

> crypto_lb_unregister():
> 
> I think this has already been raised.  Please don't do things like this.  
> I'm not sure we need loadable load balancers yet, although in any case, 
> if we always need one loaded, then make a default one that cannot be 
> unloaded or replaces the last lb.

We have several IO schedulers.
I believe different hardware setups will requere different load
balancers - for example for setups with only fast devices we can use
just simple_lb, for setups with mixed devices we need to think of
session moving between them.

Simple_lb is fast but it does not have some features, advanced1_lb for
example can have some tricky algorithm to select crypto device and/or 
have some thoughts about session moving...

But I think that we need one unloadable crypto load balancer which will
be used if all others are removed/unused.

I add it to my TODO.

> 
> Where is crypto_session_alloc() used?  It's difficult to evaluate the code
> more deeply without seeing such primitives in use.

I've sent several files in a first e-mail that are called like 
consumer.c slow_consumer.c - it is crypto consumers, actually they 
just call crypto_session_alloc() with appropriate parameters...

> 
> Thanks for doing this work and hope this helps,

Thank you.

> 
> - James
-- 
        Evgeniy Polyakov

Crash is better than data corruption. -- Art Grabowski

Attachment: signature.asc
Description: This is a digitally signed message part

<Prev in Thread] Current Thread [Next in Thread>