Subject: Re: tmpfs: Storing file contents
To: Julio M. Merino Vidal <jmmv84@gmail.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 08/12/2005 09:24:01

On Fri, Aug 12, 2005 at 12:27:56PM +0200, Julio M. Merino Vidal wrote:
> Hi all,
>
> during the last days, I've implemented very clumsy read/write vnode
> operations for tmpfs, just to familiarize myself with uiomove and the
> like (they are not in the CVS).  I've also been reading the first chapters
> of Cranor's UVM dissertation to see how to manage anonymous
> memory, although I still have many doubts.
>
> The thing is that I don't know how to store file contents in memory
> to comply with the following requirements:
> - Use pageable (unwired) memory (i.e., anonymous memory, right?).
> - Avoid multiple copies of the same data on memory.

You can do this by making the VM and read/write interfaces use the same
storage. If you use the normal read & write routines that use
memory-mapped pages for i/o, you're fine.

> - Be careful to not introduce a lot of overhead on the memory manager.
>
> I see two possibilities as regards how to manage each file:
>
> - One of them is to allocate space page by page (just as a file-system
>   allocates blocks) and attach these pages to the files needing them.
>   The fact that they are pages does not matter in the idea: we'd simply
>   be dealing with a structure whose size matches a page, but that's all.
>   That is, keep a sorted list (or another structure with better random
>   access times) of pages (blocks) that contain the file.  This'd be like
>   having a pool of fixed size structures, but over pageable memory.
>   In order to do this, it'd be nice if I could allocate a big address space
>   and allocate/deallocate individual pages easily.

My concern with this is that you now have files coming out of a backing
store. So you've recreated a traditional file system, just with the
backing in pageable memory.
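
To make the concern concrete, here is a minimal userspace sketch of option 1.
It is not kernel code and uses no UVM calls; every name in it (pool, toy_file,
toy_write, toy_read) is hypothetical, and BLK_SIZE merely stands in for the
page size. Each file keeps a map from file block number to a block in one
shared pool, so every access goes through a translation step, much as in a
disk file system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLK_SIZE  4096  /* stand-in for the page size (assumption) */
#define POOL_BLKS 64    /* blocks in the shared pool (arbitrary) */
#define FILE_BLKS 16    /* max blocks per file in this toy model */
#define NO_BLK    (-1)

/* One shared backing store for all files. */
static unsigned char pool[POOL_BLKS][BLK_SIZE];
static int pool_used[POOL_BLKS];

struct toy_file {
    int    map[FILE_BLKS]; /* file block index -> pool block index */
    size_t size;           /* file length in bytes */
};

static void toy_init(struct toy_file *f)
{
    for (int i = 0; i < FILE_BLKS; i++)
        f->map[i] = NO_BLK;
    f->size = 0;
}

/* First-fit allocation of a zeroed block from the shared pool. */
static int pool_alloc(void)
{
    for (int i = 0; i < POOL_BLKS; i++) {
        if (!pool_used[i]) {
            pool_used[i] = 1;
            memset(pool[i], 0, BLK_SIZE);
            return i;
        }
    }
    return NO_BLK;
}

/* Returns bytes written, or -1 if the file or the pool is full. */
static long toy_write(struct toy_file *f, size_t off, const void *buf, size_t len)
{
    const unsigned char *src = buf;
    size_t done = 0;

    while (done < len) {
        size_t bi = (off + done) / BLK_SIZE; /* which file block */
        size_t bo = (off + done) % BLK_SIZE; /* offset inside it */
        size_t n = BLK_SIZE - bo;
        if (n > len - done)
            n = len - done;
        if (bi >= FILE_BLKS)
            return -1;
        if (f->map[bi] == NO_BLK && (f->map[bi] = pool_alloc()) == NO_BLK)
            return -1;
        memcpy(&pool[f->map[bi]][bo], src + done, n);
        done += n;
    }
    if (off + len > f->size)
        f->size = off + len;
    return (long)done;
}

/* Returns bytes read; unallocated blocks (holes) read as zeros. */
static long toy_read(const struct toy_file *f, size_t off, void *buf, size_t len)
{
    unsigned char *dst = buf;
    size_t done = 0;

    if (off >= f->size)
        return 0;
    if (len > f->size - off)
        len = f->size - off;
    while (done < len) {
        size_t bi = (off + done) / BLK_SIZE;
        size_t bo = (off + done) % BLK_SIZE;
        size_t n = BLK_SIZE - bo;
        if (n > len - done)
            n = len - done;
        if (f->map[bi] == NO_BLK)
            memset(dst + done, 0, n);
        else
            memcpy(dst + done, &pool[f->map[bi]][bo], n);
        done += n;
    }
    return (long)done;
}
```

Note that the pool and its allocation state are shared by every file, which
is where the block map overhead and the shared locking would come from.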

> - Have an independent virtual address space for each file, backed by
>   anonymous memory, so that reads and writes are trivial: just read
>   and write from memory at a specific offset within the address
>   space.  Page mapping could be automatic upon faults (extra steps
>   could be needed to unmap unused pages to avoid swap leaks).
>   I don't know if this implementation is possible at all, or if it could
>   cause too much overhead on UVM... but if possible, it'd simplify
>   things a lot.

I _think_ this is the right way to handle things, though the Chucks and
Jason will know better. Maybe your #1 way is better, but I like this one.
;-)
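
As a rough illustration of why option 2 simplifies reads and writes, here is
a userspace analogy, not kernel code. It assumes a POSIX system where an
anonymous mmap() region plays the role of the per-file anonymous object;
anon_file, its functions, and OBJ_MAX are hypothetical names chosen for this
sketch. Pages in the reserved range are only backed by memory once they are
touched, and I/O reduces to a copy at an offset.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

#define OBJ_MAX ((size_t)1 << 20) /* range reserved per file: 1 MiB (arbitrary) */

struct anon_file {
    unsigned char *base; /* start of this file's private range */
    size_t         size; /* file length in bytes */
};

/* Reserve a zero-filled anonymous range; pages are faulted in on first touch. */
static int anon_file_init(struct anon_file *f)
{
    void *p = mmap(NULL, OBJ_MAX, PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    f->base = p;
    f->size = 0;
    return 0;
}

/* Writing is a plain copy at the given offset; no block map is consulted. */
static long anon_file_write(struct anon_file *f, size_t off, const void *buf, size_t len)
{
    if (off > OBJ_MAX || len > OBJ_MAX - off)
        return -1;
    memcpy(f->base + off, buf, len);
    if (off + len > f->size)
        f->size = off + len;
    return (long)len;
}

/* Reading is the mirror image, clamped to the file size. */
static long anon_file_read(const struct anon_file *f, size_t off, void *buf, size_t len)
{
    if (off >= f->size)
        return 0;
    if (len > f->size - off)
        len = f->size - off;
    memcpy(buf, f->base + off, len);
    return (long)len;
}

static int anon_file_destroy(struct anon_file *f)
{
    return munmap(f->base, OBJ_MAX);
}
```

In this model each file owns its range outright, so a per-file lock would be
enough to protect it; the cost is one reserved range (and its bookkeeping)
per file.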

> - Of course, there may be other better possibilities, but I can't see
>   them with my current knowledge.

I think those are the two main ideas. The differentiators would be access
abilities and memory use: memory use in the extra structures needed for
each method (more aobjs vs. whatever maps a file to/from the blocks in the
one aobj of option 1), and access abilities for the locking (each aobj
will be lockable on its own, but there will need to be some shared locking
in option 1).

> The problem is that... aside from not knowing which approach I should
> follow, I don't know how to code them as regards memory
> management.  I've read the uvm(9) manpage and searched for
> usage examples of its functions within the kernel.  Unfortunately,
> I can't find much: just some calls in process management and
> SHM... but these don't seem to be what I need (or at least I can't
> find the appropriate examples).
>
> Some of my doubts are...
> - Do I have to keep an aobj for each file?  If so, aobj's are created
>   with uao_create, right?  Which size should they have (as it's not
>   known beforehand)?

Well, I don't think you HAVE to keep an aobj for each file, I just think
it's the best idea. :-) I'm not sure about what size it should be, but
you'll have to be able to grow it anyway...
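
One way to picture the resizing problem is the userspace sketch below. It is
only a model under stated assumptions: grow_file, pages_for, and
grow_file_resize are hypothetical names, TOY_PAGE stands in for the page
size, and realloc() stands in for whatever the kernel object would do to
change size. The storage is kept in whole pages, and a shrink zeroes the
stale tail so that a later grow never exposes old data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE 4096u /* stand-in for the page size (assumption) */

struct grow_file {
    unsigned char *data;   /* npages * TOY_PAGE bytes, or NULL */
    size_t         npages; /* pages currently allocated */
    size_t         size;   /* file length in bytes */
};

/* Number of whole pages needed to hold the given byte count. */
static size_t pages_for(size_t bytes)
{
    return bytes / TOY_PAGE + (bytes % TOY_PAGE != 0);
}

/* Grow or shrink the file; returns 0 on success, -1 if out of memory. */
static int grow_file_resize(struct grow_file *f, size_t newsize)
{
    size_t need = pages_for(newsize);

    if (need != f->npages) {
        if (need == 0) {
            free(f->data);
            f->data = NULL;
        } else {
            unsigned char *p = realloc(f->data, need * TOY_PAGE);
            if (p == NULL)
                return -1;
            /* Newly added pages must read as zeros. */
            if (need > f->npages)
                memset(p + f->npages * TOY_PAGE, 0,
                       (need - f->npages) * TOY_PAGE);
            f->data = p;
        }
        f->npages = need;
    }
    /* On shrink, clear old contents left in the pages that were kept. */
    if (newsize < f->size) {
        size_t end = f->size < need * TOY_PAGE ? f->size : need * TOY_PAGE;
        if (end > newsize)
            memset(f->data + newsize, 0, end - newsize);
    }
    f->size = newsize;
    return 0;
}
```

Starting from an empty object and resizing on demand sidesteps the question
of what the initial size should be.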

> - Once I have an aobj, how do I map space within it?  Maybe I have
>   to use uvm_map, but I'm afraid that calling that function to request
>   single pages could introduce a lot of overhead...  Or can it be
>   done in a more automated way, as I described in the second
>   approach above?
> - When some people mentioned that I should avoid having multiple
>   copies of the same data in memory, they were referring to one
>   copy of the data managed by the filesystem and another one
>   stored inside the vnode's uobj, right?  If so, can't this be avoided
>   by having a getpages operation that simply loans pages from the
>   filesystem to the vnode (thus just keeping one real copy)?

Exactly. That's the point of UBC.

> I'm sorry for so many questions, but I'm really lost in this area.
> I will appreciate any suggestion, pointers to documentation or code,
> or even detailed explanations of the process I should follow (e.g.,
> which is the basic idea to store the files, which functions should I
> look at, etc.)...

Hope this helps.

Take care,

Bill
