tech-kern archive


Re: adding linux syscall fallocate



Hi Maciej,

Sorry for the delay. I tried simply returning 0 (success) from the linux fallocate syscall I implemented, but I got a segfault, so it doesn't seem to help:( Unfortunately I didn't have time to chase it further, but I'll find some time to get a ktruss output and send it to you soon.

Concerning the fifo trick, I do not know either how it manages to bypass the fallocate problem. However, last time I didn't paste and explain exactly what I'm doing when crosscompiling:

mkfifo mypipe
cat mypipe > myfile &
<execute 32bit crosscompile via ndk using mypipe as output file>

What I left out last time is that the whole thing does not hang because I run cat in the background. Also, the name of the fifo (mypipe) is passed to clang as the output file name for the cross compilation. This may shed some light on the matter, but I still don't get why it works and I cannot even recall how I got the idea:) The other interesting thing is that fallocate is only an issue when cross compiling for 32bit arm, not for 64bit arm (though the android ndk uses a different compiler build for that, which is also clang).
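
The sequence above can be sketched as a self-contained script. Here printf is only a stand-in for the NDK clang invocation, and the temporary directory and variable names are assumptions made for illustration. It shows why the reader has to run in the background; it does not explain why the trick sidesteps fallocate.

```shell
#!/bin/sh
# Sketch of the FIFO workaround; printf stands in for the real compiler run.
workdir=$(mktemp -d)
pipe="$workdir/mypipe"
out="$workdir/myfile"

mkfifo "$pipe"

# Reader: has to run in the background, because opening a FIFO for
# reading blocks until some writer opens the other end.
cat "$pipe" > "$out" &
reader=$!

# Writer: in the real build this would be the cross compiler, called
# with the FIFO as its output file name.
printf 'stand-in for compiler output\n' > "$pipe"

# cat sees EOF once the writer closes the FIFO; wait for it to finish
# before using the result.
wait "$reader"
rm "$pipe"
```

Without the `&` the script would stop at the cat line, since nothing has opened the FIFO for writing yet.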

Best regards,
r0ller

-------- Original message --------
From: Maciej < maciej.grochowski%pm.me@localhost >
Date: 9 November 2019 21:30:27
Subject: Re: adding linux syscall fallocate
To: r0ller < r0ller%freemail.hu@localhost >
 
Hi r0ller,
 
"I do know that writing a file by calling fallocate can be tricked and redirected into a named pipe of which the data can simply be pulled into a normal file."
 
My understanding of both `fallocate` (which can be emulated on non-Linux FSes/OSes) and `posix_fallocate` is that the call is made to ensure that the space in the range offset to offset+len is allocated on the storage medium.
Based on my experience (and I may not know about all possible use cases, as my main background is more Archive/Databases/device drivers), there are two main use cases for `fallocate`:
- Stability: reserving space on the disk up front, because bad things can happen in the pipeline of any process if some file is left incomplete because of a lack of disk space or other issues.
- Various performance tricks: reserve a contiguous region, mmap it, do large sequential writes or asynchronous operations. The simplest performance use case relies on the fact that if the blocks are not already allocated, flushing a write to the disk first requires reads to find free blocks and fill in the metadata.
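
A rough illustration of the gap that preallocation closes: a file's logical size and the blocks actually allocated for it are separate quantities. The sketch below does not call fallocate at all; it only creates a sparse file with dd and prints both numbers. The file name and the 1 MiB size are arbitrary choices, and the allocated figure depends on the filesystem, so no expected value is given for it.

```shell
#!/bin/sh
# Sketch: logical size versus allocated blocks of a sparse file.
workdir=$(mktemp -d)
sparse="$workdir/sparse.img"

# Extend the file to 1 MiB without writing any data: dd seeks 1024
# blocks of 1024 bytes into the output and then copies nothing.
dd if=/dev/null of="$sparse" bs=1024 seek=1024 2>/dev/null

# Logical size in bytes, as reported by stat/ls.
logical_bytes=$(wc -c < "$sparse")

# Space actually allocated on the storage medium, in KB.
allocated_kb=$(du -k "$sparse" | cut -f1)

echo "logical size: $logical_bytes bytes"
echo "allocated:    $allocated_kb KB"
```

A later write into the unallocated range can still fail for lack of space, which is the situation a working `posix_fallocate` on that range is meant to rule out.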
 
btw: people working on device drivers also tend to like fallocation, as it makes many driver error scenarios much easier to handle.
 
Now, if your cross-compilation doesn't perform complicated performance tricks or other fallocation operations that I am not aware of (but would be more than happy to learn about), we can try to fix that properly.
The `posix_fallocate` implementation that I started a couple of weeks ago should work for the use case where the filesystem mainly needs to reserve blocks up front before placing large assets like binary files, logs or just regular non-sparse files.
If you are interested we can try that, but first of all, could you run the process under "ktruss -i" and send me the output (you can just grep for `posix_fallocate`, `fdiscard`, `open`, `close` and possibly other FS operations)?
 
 
Coming back to your question: if I understand you correctly, you just redirect the operations to the FIFO in a buffering fashion and then flush them. I cannot see how this solves the fallocation issue, except that fallocate probably does not make much sense for the cross-compilation process anyway, because when someone tries to open the file or do another operation on it, that will block waiting for other processes to fill it.
Consider a single process that creates a file, calls `falloc` on it and later tries to write to it. At the very beginning it will either try to remove the previous file if it exists, or it may open the path and hang waiting for another process to fill it. I worry that you may face several such scenarios in a more complicated build.
I doubt that this will be an easy workaround for a project as large as a compilation.
Maybe just mocking fallocate to simply return success without doing anything in the FS would be an easier hack, but you will need to make sure that you won't run out of space on your working storage medium...
 
Also, before any experiments always make a backup and be sure that you can afford to lose the data you are working on! ;)
Let me know what you think
Cheers
Maciej
 
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, November 6, 2019 9:12 AM, r0ller <r0ller%freemail.hu@localhost> wrote:
 
Hi Maciej,
 
Thanks for the detailed answer! Unfortunately, I don't think that I could accomplish this task in my spare time:(
 
Please, don't take this as an offence but as a fix for my case I thought of an utter hack:
 
I do know that writing a file by calling fallocate can be tricked and redirected into a named pipe of which the data can simply be pulled into a normal file. This is what I'm already doing in my project as a workaround when building it as 32bit arm lib:
 
mkfifo mypipe
cat mypipe > myfile
<execute 32bit crosscompile via ndk>
 
The problem with this is that it cannot be used when crosscompiling autoconf projects, where a configure script creates many files: I'd need to edit the script in too many places to implement this trick.
 
However, if I could carry out this trick with the pipe when intercepting the linux fallocate call, it could work. Do you think it's feasible?
 
Best regards,
r0ller
 
-------- Original message --------
From: Maciej < maciej.grochowski%protonmail.com@localhost >
Date: 4 November 2019 23:32:56
Subject: Re: adding linux syscall fallocate
To: r0ller < r0ller%freemail.hu@localhost >
 
Hi r0ller,
 
A couple of weeks ago I also ran into a situation where I found the lack of fallocate, or POSIX_FALLOCATE(2) to be precise, a little bit sad.
At first glance, typical usage of POSIX_FALLOCATE(2) seems straightforward compared to the Linux fallocate(2), where different modes have to be supported. However, things can get a little more complicated if you are dealing with an existing file with a more complex structure.
Because of that, I found it difficult to provide a good-quality implementation without a proper way to test it.
Before EuroBSDCon 2019, as part of my work on fuzzing the FFS, I decided to bring in a well-known FS test suite (namely "XFS tests") to make sure that the bugs we fix do not introduce regressions.
The same applies to new FS features: it is relatively easy to port an implementation from another OS/FS, but without a proper way to test it I would be very careful about introducing such things to end users too quickly.
 
One thing that I was missing for XFS tests, and I am going to publish part of it soon, is a way to view the internal structure of inodes and other filesystem metadata. My primary use case was debugging mount issues, for example the issue that I showed during my presentation about fuzzing. But the same applies to code that manipulates inode blocks.
 
Hopefully we can elaborate on that. As I pointed out earlier, I would be careful about merging new FS features, especially ones as widely used as POSIX_FALLOCATE(2), without a proper FS test suite or sufficiently extensive testing, which would require proper tools such as FSDB.
 
Thanks
Maciej
 
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Sunday, November 3, 2019 6:06 PM, r0ller <r0ller%freemail.hu@localhost> wrote:
 
Hi Jaromir,
 
Indeed. That's bad news but thanks for your answer! I've even found this: https://wiki.netbsd.org/projects/project/ffs-fallocate/
Are there any details for this project besides that page? I don't know anything about NetBSD internals, but if it's not meant only for gurus, I'd have a look at it and give it a try.
 
Best regards,
r0ller
 
-------- Original message --------
From: Jaromír Doleček < jaromir.dolecek%gmail.com@localhost >
Date: 3 November 2019 15:16:34
Subject: Re: adding linux syscall fallocate
To: r0ller < r0ller%freemail.hu@localhost >

On Sun, 3 Nov 2019 at 08:57, r0ller <r0ller%freemail.hu@localhost> wrote:
 
> As you can see on the attached screenshot, "line 4741" gets printed out. So I went on to check what happens in VOP_FALLOCATE but it gets really internal there.
>
> Does anyone have any hint?
 
fallocate VOP is not implemented for FFS:
 
> grep fallocate *
ffs_vnops.c: { &vop_fallocate_desc, genfs_eopnotsupp }, /* fallocate */
ffs_vnops.c: { &vop_fallocate_desc, spec_fallocate }, /* fallocate */
ffs_vnops.c: { &vop_fallocate_desc, vn_fifo_bypass }, /* fallocate */
 
Jaromir
 
 
