[opt] use scandir to shorten initialization time of Inotify #45
Comments
If you allow it, I would be glad to make a pull request~
@NoneGG I implemented the tree handling in #48 using os.walk() instead, which is 4 times faster on rotational media, and about twice as fast on an SSD. That is, if you are using Python >= 3.5. It would be interesting to see a comparison with raw os.scandir(), but I think you can't squeeze much more out of it: os.walk() only does a few extra operations that aren't of interest to InotifyTree, and none of them is I/O-bound. Unless you need support for Python < 3.5, of course; then you could depend on scandir, and do the suggested
    try:
        from os import scandir
    except ImportError:
        from scandir import scandir
If I may make a suggestion though, I think your approach is not the best for your situation, architecturally speaking. Given such a huge tree of files, it's preferable to hook into the code creating or modifying the video files, and have it call back some handler on the other side, maybe some API exposed by the code interested in change events. If there are multiple parties interested in changes, then a message queue / fanout system would simplify things greatly.
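As a rough illustration of the comparison discussed above, here is a minimal sketch that collects every directory under a root in two ways: once via os.walk() and once via raw scandir() with the fallback import. The function names iter_dirs_walk and iter_dirs_scandir are hypothetical and are not part of the library or of #48; the sketch assumes symlinked directories should not be followed.

```python
import os

try:
    from os import scandir  # available in the standard library on Python >= 3.5
except ImportError:
    from scandir import scandir  # third-party backport, assumed to be installed


def iter_dirs_walk(root):
    """Yield root and every directory below it, using os.walk()."""
    # os.walk() does not descend into symlinked directories by default.
    for dirpath, _dirnames, _filenames in os.walk(root):
        yield dirpath


def iter_dirs_scandir(root):
    """Yield root and every directory below it, using scandir() directly."""
    stack = [root]
    while stack:
        current = stack.pop()
        yield current
        try:
            entries = list(scandir(current))
        except OSError:
            # Unreadable directory: skip it, as os.walk() does by default.
            continue
        for entry in entries:
            # The entry type usually comes with the directory read itself,
            # so no extra stat() call is needed on most platforms.
            if entry.is_dir(follow_symlinks=False):
                stack.append(entry.path)
```

Both generators should yield the same set of paths, which is the set a recursive watcher would need to register; timing them against each other on a real tree would show whether raw scandir() gains anything over os.walk().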
@xlotlu Thank you for your response~ As far as I know, the inotify-based monitor is designed to catch both human operational mistakes and code mistakes, and it is still in development. Your suggestion sounds reasonable, and we do have an API and a subscription mechanism. But if we need to monitor human operations, a hook at the file system level is needed; that's why we chose inotify. Could you tell me why
@NoneGG yes, on python < 3.5 it is just as slow as before. I made some benchmarks which you can find attached to the PR. If you need < 3.5 support, then you need to depend on
I see. I didn't imagine you'd have arbitrary, human-driven modifications. If so, you have no other option, short of making sure all those modifications go through a custom application.
I didn't say that - a huge tree of files is probably the best way to handle your storage needs. It's the
Maybe you could approach this from the other direction: monitor for "everything", and filter out the events that you're interested in? The kernel's audit system comes to mind, and it can monitor specific paths. There's also fanotify, but I don't think it fits your requirements.
@xlotlu According to experiments on our CDN server, it does take a long time to initialize the inotify tree (that's why I opened this issue), but as for CPU and memory, it indeed does not use much. (I am not so sure, I only use
I want to use inotify to monitor video files on a CDN server, and it takes too long to initialize the InotifyTrees when I run the demo script (about 1061070 files).
I notice that os.listdir is used in the code. Is there any possibility that we can use scandir.listdir (it is said that scandir will be merged into official Python 3 in the next release) to optimize the initialization speed?
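To make the question concrete, here is a minimal sketch of the two ways of finding the subdirectories of a single directory. It is not the library's actual code; subdirs_listdir and subdirs_scandir are hypothetical names, and the sketch assumes Python >= 3.5 and that symlinks should not be followed.

```python
import os


def subdirs_listdir(path):
    """List subdirectories with os.listdir(): extra stat calls per entry."""
    result = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        # isdir() and islink() each ask the file system about the entry again.
        if os.path.isdir(full) and not os.path.islink(full):
            result.append(full)
    return result


def subdirs_scandir(path):
    """List subdirectories with os.scandir(): entry type is usually cached."""
    result = []
    for entry in os.scandir(path):
        # On most platforms the type is returned by the directory read,
        # so this check normally needs no additional system call.
        if entry.is_dir(follow_symlinks=False):
            result.append(entry.path)
    return result
```

On a tree with around a million files, the per-entry calls in the first version are the likely source of the slow initialization, though only a measurement on the actual server would confirm that.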